An Extended Semantic Definition of Pascal
for  Proving  the  Absence  of  Common  Runtime  Errors 

by  Steven  M.  German 


We  present  an  axiomatic  definition  of  Pascal  which  is  the  logical  basis  of  the  Runcheck 
system,  a  working  verifier  for  proving  the  absence  of  runtime  errors  such  as  arithmetic 
overflow, array subscripting out of range, and accessing an uninitialized variable. Such
errors  cannot  be  detected  at  compile  time  by  most  compilers.  Because  the  occurrence 
of  a  runtime  error  may  depend  on  the  values  of  data  supplied  to  a  program,  techniques 
for  assuring  the  absence  of  errors  must  be  based  on  program  specifications.  Runcheck 
accepts  Pascal  programs  documented  with  assertions,  and  proves  that  the 
specifications  are  consistent  with  the  program  and  that  no  runtime  errors  can  occur.  Our 
axiomatic definition is similar to Hoare's axiom system, but it takes into account certain
restrictions that have not been considered in previous definitions. For instance, our
definition accurately models uninitialized variables, and requires a variable to have a
well defined value before it can be accessed. The logical problems of introducing the
concept  of  uninitialized  variables  are  discussed.  Our  definition  of  expression  evaluation 
deals  more  fully  with  function  calls  than  previous  axiomatic  definitions. 

Some  generalizations  of  our  semantics  are  presented,  including  a  new  method  for 
verifying  programs  with  procedure  and  function  parameters.  Our  semantics  can  be  easily 
adapted to similar languages, such as ADA.

One  of  the  main  potential  problems  for  the  user  of  a  verifier  is  the  need  to  write 
detailed,  repetitious  assertions.  We  develop  some  simple  logical  properties  of  our 
definition  which  are  exploited  by  Runcheck  to  reduce  the  need  for  such  detailed 
assertions.

1. Introduction

In  most  programming  languages,  there  are  various  undefined  conditions  and  illegal 
operations  such  as  arithmetic  overflow  and  array  subscripting  out  of  range.  We  call 
these  conditions  runtime  errors  because  they  are  violations  of  language  or 
implementation  imposed  restrictions  on  program  execution.  Current  compilers  do  not 
attempt  to  detect  runtime  errors  during  compilation,  though  they  commonly  insert 
special  code  to  test  for  certain  errors  during  execution.  This  approach  is  costly  in 
execution  time  and  compiled  program  size,  and  of  course  gives  no  assurance  that  a 
program  will  run  to  completion. 

The  occurrence  of  a  runtime  error  may  depend  on  the  values  of  data  supplied  to  a 
program.  For  this  reason,  any  technique  for  assuring  the  absence  of  runtime  errors  must 
be  based  on  some  method  for  specifying  programs.  Showing  the  absence  of  runtime 
errors  is  thus  a  natural  problem  in  program  verification. 

We  have  been  developing  an  automatic  verifier  for  proving  the  absence  of  runtime 
errors  in  the  language  Pascal.  The  Runcheck  system  takes  as  input  a  Pascal  program 
with  entry,  exit  and  optional  invariant  assertions,  and  proves  that  the  specifications  are 
consistent  with  the  program  and  that  no  runtime  errors  can  occur.  Invariant  assertions 
are  not  required  in  many  cases  because  the  system  is  able  to  generate  simple  invariants 
automatically,  but  more  subtle  invariants  must  be  supplied  by  the  user.  The  system 
currently  checks  for  the  following  kinds  of  errors:  accessing  a  variable  that  has  not  been 
assigned  a  value,  array  subscripting  out  of  range,  subrange  type  error,  dereferencing  a 
NIL  pointer,  arithmetic  overflow,  division  by  zero,  control  stack  overflow,  exceeding 
heap  storage  bounds,  and  UNION 1  type  selection  errors.  The  verifier  and  our  semantic 
definition  of  Pascal  do  not  yet  include  REAL  or  SET  types,  but  pointers  are  permitted. 

1 The language accepted by the verifier includes verifiable UNION types instead of Pascal variant records. Refer to [3] for a
discussion of the problems of variants and the details of our UNION types.

Obviously, the notion of runtime error does not include every kind of programming
error. The runtime errors for a language are the conditions under which programs
cannot  continue  to  execute  or  continued  execution  would  give  undetermined  results.  For 
a  program  to  be  useful,  one  needs  to  know  more  about  it  than  that  it  does  not  have 
runtime  errors.  Consider  a  program  which  is  intended  to  copy  a  list  made  of  pointers 
and  records;  it  can  have  an  error  which  causes  it  to  produce  the  wrong  result  without 
any  runtime  errors  in  the  sense  we  are  using.  Runcheck  makes  it  possible  to  verify  such 
a  program  at  several  levels  of  detail.  For  the  least  detailed  verification,  the  program  is 
submitted  to  Runcheck  without  additional  specifications  related  to  list  copying.  In  this 
case,  Runcheck  attempts  to  prove  only  that  the  program  is  free  from  runtime  errors.  In 
general,  it  may  be  necessary  for  the  user  to  supply  some  specifications  and  invariants 
even  at  this  level  of  detail.  For  instance,  the  program  may  have  a  control  stack 
overflow unless the input is acyclic. User supplied invariants would be needed in case
the  simple  invariants  generated  automatically  by  the  system  are  not  sufficient  to  prove 
absence  of  runtime  errors.  A  more  detailed  verification  could  be  obtained  by  adding 
specifications  saying  that  the  result  of  the  program  is  a  copy  of  the  input.  An  even 
more  detailed  verification  could  establish  bounds  on  the  performance  of  the  program, 
such  as  the  maximum  number  of  times  each  statement  is  executed  as  a  function  of  the 
input  [10]. 

The  purpose  of  Runcheck  is  to  automate  the  routine  aspects  of  the  least  detailed 
verifications,  while  still  allowing  the  user  to  supply  additional  information  for  more 
detailed  verifications.  Thus  although  Runcheck  is  primarily  used  to  perform  shallow 
verifications,  it  provides  a  general  logical  framework  for  proving  detailed  properties. 
Every  program  verified  by  Runcheck  is  assured  to  have,  as  a  minimum,  the  property 
that  no  runtime  errors  can  occur  if  the  entry  assertion  is  satisfied. 

This  paper  is  concerned  with  an  extended  axiomatic  definition  of  Pascal,  which  is  the 
logical  basis  of  Runcheck.  The  extended  definition  is  similar  to  the  familiar  Hoare 
axiom  system  [6],  but  it  takes  into  account  certain  restrictions  on  the  computation  that 
have  not  been  considered  in  previous  axiomatic  language  definitions. 
Although the details of our semantic definition refer specifically to Pascal, most of the
ideas are broadly applicable. The runtime errors which exist in Pascal are also present
in many other languages, and the ideas in our semantic definition can be adapted to
other languages with additional kinds of errors. ADA [7] is an especially interesting
case: it should be possible to define much of the language by generalizing our definition
of Pascal. For instance, the problem of generalizing our definition to allow dynamic
subrange types is discussed briefly in section 8.1.

Our  axiomatic  definition  of  Pascal  consists  of  some  first  order  theories  plus  axioms  and 
inference  rules  for  reasoning  about  programs.  One  of  the  first  order  theories  concerns  a 
predicate,  DEF(x),  which  is  true  of  expressions  having  a  well  defined  value.  The  other 
first  order  theories  are  familiar  ones  such  as  arithmetic.  Runcheck  is  more  than  a  direct 
implementation  of  these  logical  components;  a  practical  program  verifier  should  provide 
as much assistance as possible, e.g., in generating inductive assertions. All of the
example  programs  discussed  in  this  paper  have  been  handled  completely  automatically 
by  the  system. 

Practical  results  with  Runcheck  have  been  reported  in  [2].  An  earlier  approach  to 
formalizing  the  extended  semantics  is  presented  in  collaboration  with  D.  Luckham  and 
D.  Oppen  in  [4]. 

The  theorems  in  the  Hoare  axiom  system  are  of  the  form,  P{A}Q.  Intuitively,  this 
formula  states  that  if  P  holds  before  executing  a  program  A,  then  if  and  when  A 
terminates, Q will hold. In [5,6] and elsewhere, the relation P{A}Q is taken to be true
if  there  is  a  runtime  error  in  executing  A.  Hoare  chose  to  make  the  interpretation  that 
if  an  error  occurred,  the  effect  of  the  program  would  be  "undefined,"  as  if  the  program 
had  failed  to  terminate. 

In our extended semantics, P [[A]] Q is defined to mean that if P holds, then A executes
without runtime errors, and if A terminates, Q will hold. Since virtually all programs
are intended to execute without runtime errors, a proof of P [[A]] Q is much more useful
than one of P{A}Q, from a practical point of view.2 If it is possible to verify the absence
of  runtime  errors  in  a  program,  the  implementation  can  omit  the  usual  runtime  error 
checking  code  —  an  increase  of  efficiency  without  loss  of  reliability.  Also,  the  extended 
semantics  is  a  convenient  system  for  showing  the  absence  of  certain  errors  in  programs 
that  are  not  intended  to  terminate. 

Our  proof  system  is  general  purpose  in  that  any  partial  correctness  specification  can  be 
expressed  by  choosing  P  and  Q.  Absence  of  runtime  errors  is  proven  together  with 
other  properties.  There  are  other  possible  formulations;  one  could  develop  a  proof 
system  based  on  statements  of  the  form  SAFE[P,  A],  meaning  that  if  P  holds 
beforehand,  then  A  executes  without  runtime  error.  The  disadvantage  of  such  a  system 
is  that  proofs  of  the  absence  of  runtime  errors  often  require  lemmas  about  more  general 
properties  of  the  program. 

For  example,  consider  a  simple  program  which  searches  in  an  array  A  for  an  element 
equal to KEY. The elements are stored in A[1], ..., A[N-1]. The fast linear search
stores  the  key  in  the  last  position  of  the  array  A  before  searching,  so  that  the  search 
loop  does  not  have  to  test  whether  the  index  has  become  greater  than  N.  The  result  of 
the  search  is  returned  in  the  variable  I. 

Example  1:  Fast  Linear  Table  Search. 

VAR N:INTEGER;
TYPE ARR=ARRAY[1:N] OF INTEGER;

PROCEDURE SEARCH(KEY:INTEGER; A:ARR; VAR I:INTEGER);
GLOBAL (N);
ENTRY DEF(N) ∧ 1≤N ∧ N≤MAXINT;
BEGIN
  A[N]:=KEY;
  I:=1;
  WHILE A[I]≠KEY DO I:=I+1;
END;
This program depends on the fact that A[N] has the value KEY throughout execution
of the loop. Otherwise, if the key was not found in A, the loop would continue and
attempt to access A[N+1], causing a subscripting error. It is necessary to prove that
A[N]=KEY is an invariant of the loop, and in our extended semantics, such lemmas can
be proven together in one step with the proof of absence of runtime errors.
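
The role of the invariant can be illustrated with a small executable sketch. The Python below is neither Runcheck input nor the notation of this paper; the function name, the unused position 0 (kept so that indices run from 1 to N as in Example 1), and the runtime assert statements are assumptions made for illustration. The assertions check during execution the facts that a verifier would establish before execution.

```python
def fast_linear_search(key, a, n):
    """Sentinel search over positions 1..n of a; position 0 is unused padding."""
    a = list(a)                      # A is a value parameter in Example 1, so work on a copy
    assert 1 <= n < len(a)           # ENTRY: 1 <= N, and positions 1..N exist
    a[n] = key                       # A[N] := KEY, the sentinel
    i = 1
    while a[i] != key:
        assert a[n] == key           # the loop invariant A[N] = KEY
        assert i < n                 # hence I+1 is still a legal subscript
        i += 1
    return i


found_at = fast_linear_search(7, [None, 5, 7, 9, 0], 4)
not_found = fast_linear_search(42, [None, 5, 7, 9, 0], 4)
```

With these inputs the first call stops at the stored element and the second stops at the sentinel in position N. Without the sentinel assignment, the second call would run past the end of the array, which is exactly the subscripting error discussed above.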

The  procedure  SEARCH  is  presented  to  the  Runcheck  system  with  an  ENTRY  assertion 
stating  that  N  has  a  value  between  1  and  MAXINT,  the  largest  integer.  The  system  is 
able  in  this  case  to  verify  absence  of  subscripting  errors,  arithmetic  overflow,  and 
uninitialized  variable  errors  (the  use  of  the  value  of  a  variable  before  it  has  been 
assigned  a  value),  automatically,  given  only  the  ENTRY  assertion  and  program  text  as 
shown in Example 1. In particular, the necessary loop invariants including A[N]=KEY
are generated automatically without any effort on the part of the user. The reader is
warned not to form an opinion of the system's capabilities on the basis of this small
introductory example alone; a variety of more interesting programs have been handled
by the system. Some of them can be found in section 7 of this paper and in [2].

This  paper  is  divided  into  nine  sections  and  two  appendices.  Section  2  contains 
important  definitions,  particularly  the  definitions  of  the  language  and  notation  of  the 
extended  semantics.  Section  3  is  mainly  concerned  with  the  predicate  DEF,  which  is 
true  of  expressions  having  a  well  defined  value.  Section  4  presents  some  of  the  basic 
inference  rules  of  the  extended  semantics.  Section  5  presents  a  precise  axiomatic 
definition  of  the  evaluation  of  expressions  in  Pascal.  In  section  6,  the  definition  of 
expression  evaluation  is  used  as  the  basis  of  a  definition  of  Pascal  statements,  functions, 
and  procedures.  Section  7  develops  some  properties  of  the  extended  definition  that 
are  valuable  when  verifying  actual  programs.  Section  8  discusses  some 
generalizations  of  the  extended  definition,  including  a  new  method  of  verifying 
programs  with  procedure  parameters.  Following  this  is  a  discussion  of  our  general 
conclusions.  Finally,  Appendix  A  gives  details  of  the  implementation  of  the  extended 
semantics  in  Runcheck,  based  on  the  principles  developed  in  section  7,  and  Appendix 
B discusses the details of a definition of simultaneous substitution for disjoint Pascal
variables. 


2.  Preliminaries 


2.1  General  definitions 

#T  reference class (see [11]), used to represent the set of values of a
dereferenced pointer of type ↑T.

#T⊂P⊃  value of the variable P↑, where P has type ↑T. Throughout this paper, first
order language terms of the form R⊂P⊃ will denote Pascal expressions of the form P↑.
Any Pascal expression involving pointers can be translated into this notation, provided
that the types of the pointer variables have been specified. For further details, refer to
[11].

POINTERSTO(#T)  set of all pointer values of type ↑T.

<A, [I], E>  value of the array A after assigning the value E in the Ith position.

<R, .F, E>  value of R after R.F:=E.

<#T, ⊂P⊃, E>  value of #T after P↑:=E, where P has type ↑T.


Functions  mapping  Pascal  expressions  into  types: 

type(E)  the  type  of  an  expression  E. 
indextype(A)  value is R if A has type ARRAY[R] OF S.


Phrases used in a special sense:

The  phrase  simple  variable  is  synonymous  with  both  variable  identifier  and  declared 
variable. 

A selected variable is a component of a variable identifier (e.g. A[I] is a selected
variable). 

A  Pascal  variable  is  either  a  variable  identifier  or  a  selected  variable  [9]. 


2.2 Notation for Substitution

Simultaneous  Substitution  for  Identifiers. 


If P(X, Y) is a formula where X = [x1, ..., xn] and Y = [y1, ..., ym] are ordered sets of
free variable identifiers, then P(A, B), where A = [a1, ..., an] and B = [b1, ..., bm] are
ordered sets of terms, stands for the result of simultaneously substituting the ai for the
xi and the bj for the yj in P.

If the set X of free variable identifiers of a formula P(X) is partitioned into subsets X1
and X2, then P(X1, X2) stands for P(X), and P(A1, A2), where A1 and A2 are ordered
sets of terms, stands for the result of simultaneously substituting in P the terms in A1
for the variables X1 and the terms in A2 for the variables X2.

Substitution  for  a  Pascal  Variable. 

P|^{v}_{t}, where v is any term denoting a Pascal variable, is defined recursively as follows.

P|^{x}_{t}, where x is an identifier, stands for P with t substituted for x.

P|^{v[i]}_{t} = P|^{v}_{<v,[i],t>}

P|^{v.f}_{t} = P|^{v}_{<v,.f,t>}

P|^{v⊂p⊃}_{t} = P|^{v}_{<v,⊂p⊃,t>}
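
The recursion above always terminates in a substitution for an identifier, because each step strips one selector from the variable and wraps the substituted term in one update term. A minimal Python sketch of that reduction follows. The tuple encoding of variables and terms and the name reduce_substitution are assumptions made for illustration; they are not part of the definition.

```python
def reduce_substitution(v, t):
    """Rewrite a substitution for a selected variable as one for an identifier.

    v is an identifier (a str) or a triple (kind, base, selector) with kind in
    {"index", "field", "deref"}; t is any term. Returns (x, t2), x an identifier.
    """
    while not isinstance(v, str):
        kind, base, selector = v
        # One step of the definition: substituting t for "base selector" is the
        # same as substituting the update term <base, selector, t> for base.
        t = ("update", base, (kind, selector), t)
        v = base
    return v, t


# The target A[i].f with substituted term e reduces to a substitution for A.
target = ("field", ("index", "A", "i"), "f")
identifier, term = reduce_substitution(target, "e")
```

Here the resulting term encodes <A, [i], <A[i], .f, e>>, so an assignment to A[i].f is treated as an assignment of a new value to the whole array A.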


2.3  Disjoint  Pascal  Variables 

Intuitively,  two  Pascal  variables  are  disjoint  iff  an  assignment  to  one  of  them  cannot 
affect  the  value  of  the  other.  It  is  obvious  that  in  languages  with  array  subscripting 
and  pointers,  disjointness  is  a  dynamic  property  —  it  depends  on  the  values  of  variables. 
For instance, A[i] and A[j] are disjoint iff i≠j.

If v1, ..., vn are disjoint Pascal variables, it is possible to define the simultaneous
substitution

P|^{v1, ..., vn}_{t1, ..., tn}

of n expressions for n Pascal variables, in terms of the sequential substitutions defined
above in 2.2. This definition and the formal definition of disjointness are needed only
for the procedure call rules; details are presented in Appendix B.


2.4  Formulas  in  the  extended  semantics 

The  syntax  of  formulas  is  ordinary,  and  is  included  here  mainly  for  reference.  A 
formula1 is a pure first order formula. The syntactic category of program statements
includes  all  executable  Pascal  statements  plus  some  additional  statements  which  are  used 
only  at  intermediate  steps  during  proofs.  The  new  statement  types,  known  as  evaluation 
statements  and  assume  statements,  do  not  initially  appear  in  programs,  but  can  be 
introduced  by  certain  rules  during  the  course  of  a  proof.  Evaluation  statements 
correspond  to  the  action  of  evaluating  an  expression  or  computing  the  location  of  a 
variable. Assume statements are used by some of the proof rules to record previously
justified  logical  assumptions  at  points  within  the  body  of  an  executable  program. 

Implicitly  associated  with  each  formula  is  a  set  of  declarations  of  constants,  variables, 
types,  and  defined  procedures  and  functions,  corresponding  to  a  static  scope  in  a 
program.  The  syntactic  distinction  between  declared  and  undeclared  symbols  is  made 
with  respect  to  the  scope.  It  is  assumed  that  all  name  conflicts  in  the  scope  are  removed 
by  renaming. 

<variable>::= <declared variable> | <undeclared variable>

<op>::= <Pascal built in function>
      | <declared function sign>
      | <undeclared function sign>

<term>::= <op> (<termlist>) | <variable> | <constant>

<termlist>::= [<term> [, <term>]*]

<predicate>::= <declared boolean function sign>
             | <Pascal built in predicate (=, <, ≤)>
             | <undeclared predicate sign>

<atomic>::= <predicate> (<termlist>) | True | False

<formula1>::= <formula1> <logical connective> <formula1> | ¬ <formula1>
            | ∀ <undeclared variable> <formula1>
            | <atomic>

<statement>::= <Pascal executable statement>
             | <assume statement>
             | <evaluation statement>
             | <statement>; <statement>

<assume statement>::= ASSUME <formula1>

<evaluation statement>::= Eval <Pascal expression>
                        | Locate <Pascal variable>

<subprogram declaration>::= <Pascal function declaration>
                          | <Pascal procedure declaration>

<formula of unextended definition>::= <formula1>
                                    | <formula1> {<statement>} <formula1>
                                    | <formula1> {<subprogram declaration>} <formula1>

<formula>::= <formula1>
           | <formula1> [[<statement>]] <formula1>
           | <formula1> [[<subprogram declaration>]] <formula1>


Throughout  the  paper,  we  will  distinguish  between  the  type  of  an  expression  and  its 
sort  in  the  many  sorted  first  order  language.  By  the  type  of  an  expression,  we  mean  its 
Pascal  type  according  to  the  scope.  By  the  sort  of  an  expression,  we  mean  its  sort  in  the 
first  order  language.  Except  for  subranges,  the  sort  of  an  expression  is  the  same  as  its 
type.  Integer  and  integer  subrange  expressions  are  of  sort  integer.  Similarly,  expressions 
whose  type  is  a  subrange  of  an  enumerated  type  have  the  same  sort  as  the  enumeration. 
A  sort  will  be  said  to  cover  both  the  type  with  the  same  name  and  all  subranges  of  the 
type. 
To be well formed, a statement must satisfy the syntax and type requirements of the
programming language [9]. Because of the correspondence between types and sorts, an
expression satisfies the type requirements of the programming language iff it is a well
formed term according to the sorts. A formula1 is a first order formula which may
contain  free  occurrences  of  declared  and  undeclared  variables.  Each  term  or  atomic 
formula  whose  outer  sign  is  declared  or  Pascal  predefined,  must  also  satisfy  the  type 
requirements  of  the  programming  language. 


2.5 Notation for the extended semantics

The  axioms  and  inference  rules  in  the  extended  semantic  definition  are  actually  schemes, 
or  infinite  sets  of  axioms  and  rules.  In  this  respect,  our  system  is  no  different  from 
previous  axiomatic  definitions.  When  a  scheme  is  applied,  information  from  the 
program  scope  must  be  substituted  in  certain  places.  To  specify  the  information  that  is 
to  be  substituted,  we  use  a  meta  notation.  An  expression  involving  a  function  or 
predicate  sign  in  Bold  Italics  indicates  a  term  or  formula  to  be  substituted.  Instances  of 
the  axiom  or  rule  are  formed  by  evaluating  the  italicized  meta  expression  to  produce  a 
term or formula. For example, the rule for assignment to a whole variable is:

P [[Eval y]] Inrange(y, type(x)) ∧ Q|^{x}_{y}
---------------------------------------------
P [[x := y]] Q

Consider  a  typical  context: 

TYPE s=1..600;
VAR g:s; h:INTEGER;
g := h+4;


Since  g  is  a  subrange  variable,  the  assignment  statement  will  cause  a  subrange  error 
unless h+4 is in the correct range. Inrange(y, type(x)) is the notation for a formula
which asserts that the value of y is in the range of the variable x. In the example
context, the desired instance of the rule is:

P [[Eval h+4]] 1≤h+4 ∧ h+4≤600 ∧ Q|^{g}_{h+4}
---------------------------------------------
P [[g := h+4]] Q


2.6  Formula  Constructing  Functions 

Inrange(<expression>, <type>)

Inrange is a function mapping <expression> × <type> → <formula1>. The expression
must be of a sort which covers the type.

if type is a subrange a..b,

Inrange(expression, type) → a≤expression ∧ expression≤b,
otherwise,

Inrange(expression, type) → TRUE.
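
Since Inrange builds a formula rather than testing a value, it can be sketched as a function from text to text. The Python below is an illustrative assumption only: the name inrange, the encoding of a subrange type as a (low, high) pair, the encoding of any other type as its name, and the ASCII spelling of the connectives are not taken from the definition above.

```python
def inrange(expression, type_):
    """Build the formula Inrange(expression, type_) as a string.

    type_ is a (low, high) pair for the subrange low..high, or a type name
    (a str) for any type that is not a subrange.
    """
    if isinstance(type_, tuple):
        low, high = type_
        return f"{low}<={expression} and {expression}<={high}"
    return "TRUE"


subrange_case = inrange("h+4", (1, 600))
other_case = inrange("h+4", "INTEGER")
```

For the assignment g := h+4 with g of type 1..600, the first call yields the range condition that appears in the instance of the assignment rule; for a target of type INTEGER the condition is trivially TRUE.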

Disjoint(<Pascal variable>, <Pascal variable>)

The function Disjoint maps a pair of Pascal variables into a formula1 which is true iff
the variables are disjoint. Refer to Appendix B for a detailed definition of Disjoint.

Disjoint-set(<set of Pascal variables>)

For any finite set of Pascal variables, Disjoint-set constructs a formula1 which is true iff
all pairs of variables in the set are disjoint.


3.  Theory  of  Definedness:  The  Predicate  DEF 

In  order  to  introduce  the  possibility  that  a  program  variable  can  be  uninitialized,  we 
assume the existence of an uninitialized scalar value, Ω. The value of a newly created
program variable is unspecified. (This is explained more fully in section 6.S.) Before
the program can use the value of a variable, it must assign the variable a fully
initialized value, one such that none of its components is equal to Ω. The predicate DEF
will be true only of these fully initialized values.

In the intended model of the first order theory of DEF, terms of a simple sort range
over a universe of values including Ω. Values of compound sorts are built up by using
the sets of simple values as components. For example, the possible values of a variable
of sort ARRAY[s] OF INTEGER include arrays with some positions equal to Ω.

Axioms  DEF1-DEF8  below  describe  the  properties  of  DEF  and  of  Pascal  types. 

DEF1) for every constant c, DEF(c) is an axiom.

DEF2) if e is of an enumerated sort (c1, ..., cn),

DEF(e) ⊃ e=c1 ∨ ... ∨ e=cn.

DEF3a) if x is an expression of sort ARRAY[a..b] OF t,

DEF(x) ≡ (∀i a≤i∧i≤b ⊃ DEF(x[i])).

DEF3b) if r is of a Pascal record sort, and f1, ..., fn are the record field names,

DEF(r) ≡ DEF(r.f1) ∧ ... ∧ DEF(r.fn).

DEF3c) if #t is of a reference class sort,

DEF(#t) ≡ (∀p ∈ POINTERSTO(#t)) (p≠NIL ⊃ DEF(#t⊂p⊃)).

DEF4) DEF(a)∧DEF(b) ⊃ DEF(a ⊕ b)

where ⊕ is an operator in {+, -, *, =, ≠, <, ≤, AND, OR, NOT}.

DEF5) DEF(a)∧DEF(b)∧b≠0 ⊃ DEF(a/b)∧DEF(a DIV b)∧DEF(a MOD b)

Axiom DEF6 defines equality for compound types:

DEF6a) if x and y are expressions of a record sort, and f1, ..., fn are the field names,
x=y ≡ (x.f1=y.f1 ∧ ... ∧ x.fn=y.fn).

DEF6b) if x and y are expressions of sort ARRAY[a..b] OF t,
x=y ≡ (∀i a≤i∧i≤b ⊃ x[i]=y[i]).


The  following  two  axioms  are  not  normally  needed  for  proving  absence  of  runtime 
errors  in  programs,  but  are  included  for  thoroughness: 
DEF7) for each sort3 s, (∃X_s ¬DEF(X_s)) is an axiom, where X_s is a variable of sort s.

Axiom DEF8 states that the result of selecting a component of an array or reference
class using an undefined or out of range index is not DEF.

DEF8a) if x is of sort ARRAY[a..b] OF t,

DEF(x[i]) ⊃ a≤i∧i≤b.

DEF8b) if #t is of a reference class sort,

DEF(#t⊂p⊃) ⊃ DEF(p)∧p≠NIL.

The  resulting  theory  of  DEF  is  still  not  logically  complete,  e.g.  because  it  does  not  say 
much  about  the  undefined  values.  But  we  should  not  expect  to  find  such  details  in  a 
programming  language  definition.  All  of  the  properties  needed  for  proving  absence  of 
errors  in  programs  have  been  included. 
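
The intended model can be made concrete with a small sketch. The Python below is an assumption-laden illustration, not part of the axiom system: it represents the uninitialized scalar value by a sentinel object named OMEGA, arrays by lists, and records by dictionaries, and it computes definedness in the way axioms DEF3a and DEF3b describe for arrays and records. Reference classes are left out.

```python
OMEGA = object()  # hypothetical stand-in for the uninitialized scalar value


def is_def(value):
    """Return True iff value is fully initialized in this toy model."""
    if value is OMEGA:
        return False
    if isinstance(value, list):       # array: every position must be defined
        return all(is_def(item) for item in value)
    if isinstance(value, dict):       # record: every field must be defined
        return all(is_def(item) for item in value.values())
    return True                       # any other scalar counts as initialized


partly_initialized = [1, OMEGA, 3]
record_value = {"key": 4, "items": [1, 2]}
```

In this model a partly initialized array is a legitimate value of its sort but is not DEF, which matches the remark that the values of compound sorts include arrays with some positions uninitialized.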


3.1  Consistency  of  the  theories  of  DEF  and  datatypes. 

Each  sort  has  some  standard  properties  which  must  be  included  in  the  complete  logical 
system. Proofs involving the integer sort appeal to the usual properties of integers, etc.
In  the  extended  semantics,  each  sort  ranges  over  a  universe  including  some  uninitialized 
values.  This  section  is  concerned  with  the  question  of  how  the  presence  of  uninitialized 
values  affects  the  theories  of  the  sorts.  One  problem  that  could  potentially  arise  is  that 
the  standard  properties  associated  with  a  sort  could  imply  that  all  its  elements  are  DEF, 
contradicting  axiom  DEF7. 

Consider the conjunction of axioms DEF1 and DEF7. Axiom DEF1 says that every
constant symbol in the language corresponds to an initialized value. Axiom DEF7
asserts that there are values for which DEF is false. Obviously, these values cannot be
named constants or terms built from constants. This raises the questions of consistency
and of what the models of the sorts are like. In the extended semantics, each sort must
have a theory whose models contain at least one unnamed element. This requirement is
easily  satisfied,  but  it  must  be  taken  into  account  in  choosing  axioms  for  each  sort.  For 
instance,  axiom  DEF2  permits  the  models  of  enumerated  sorts  to  contain  extra  elements 
which  are  not  DEF.  Consequently,  all  finite  simple  and  compound  sorts  have  extra 
elements  that  are  not  DEF. 

The extended semantics is intended to be used with a "standard" theory of the integers,
and with standard theories of data structures with the selection and assignment
operations [11]. Each of these theories has a standard model containing only the values
for which DEF is true in the extended semantics. It would be possible to assure the
consistency of the combined theories by restricting the axiomatization of data structures
to values for which DEF is true. Under this approach, if ∀x P(x) is a standard axiom
for a certain sort, then ∀x DEF(x)⊃P(x) would be chosen as the corresponding axiom in
the  extended  semantics.  The  obvious  disadvantages  of  this  approach  are  that  the 
axioms  are  more  complicated  and  proofs  would  have  to  establish  the  truth  of  DEF  for 
every  term  in  order  to  apply  sort  axioms.  We  would  like  the  extended  semantics  to 
have  the  same  sort  axioms  as  the  ordinary  system,  so  we  choose  to  use  the  standard 
axioms  of  data  structures  and  to  take  advantage  of  the  existence  of  nonstandard  models. 
For  instance,  since  all  of  the  standard  integers  have  constant  symbols,  the  models  of  our 
integer  sort  under  the  DEF  axioms  are  the  nonstandard  models  of  arithmetic  —  models 
with  extra  elements.  There  is  only  one  point  that  requires  some  care,  and  that  is 
combining  the  theories  of  DEF  and  arithmetic.  The  "standard"  theory  of  arithmetic 
must  not  contain  the  symbol  DEF.  If  an  axiom  system  for  arithmetic  is  used,  it  must  not 
contain  DEF.  For  example,  if  the  axiom  system  has  an  induction  schema,  instances 
involving  DEF  cannot  be  used.  Without  this  precaution,  the  axioms  would  give  a 
contradiction.  Suppose  that  the  induction  scheme  for  integers  is 

Φ(0) ∧ (∀n Φ(n) ⊃ Φ(n-1)∧Φ(n+1)) ⊃ (∀x Φ(x)). (P)

Then from DEF(0) and DEF(n) ⊃ DEF(n-1)∧DEF(n+1) one could deduce ∀x DEF(x),
which contradicts axiom DEF7.

Another approach is to use a special axiomatization of arithmetic that allows instances
with DEF. One such scheme for induction on the integer sort is:

Φ(0) ∧ (∀n Φ(n) ⊃ Φ(n-1)∧Φ(n+1)) ⊃ (∀x DEF(x) ⊃ Φ(x)). (S-P)


3.2  Tbe  relattonahip  between  DEF  and  Inrange 

In Pascal, every subrange type is bounded by two constants,⁴ a..b. Thus according to the
definition of Inrange, Inrange(x, s) implies DEF(x), if s is a subrange. This follows from
the properties of the ≤ ordering of the integers. For example, it is a theorem in the
theories of integer ordering and DEF that

    ∀x (1≤x ∧ x≤4) ⊃ DEF(x),

because the standard properties of integer ordering imply that

    ∀x (1≤x ∧ x≤4) ⊃ (x=1 ∨ x=2 ∨ x=3 ∨ x=4)

and each of these constants is DEF. Note, however, that

    ∀x∀y∀z (DEF(x) ∧ DEF(z) ∧ x≤y ∧ y≤z) ⊃ DEF(y)   (3.1)

is not a theorem about DEF, because it cannot be proven from S-P, the special form of
induction on the integers. Indeed, there are nonstandard interpretations of the theories
of DEF and integers for which formula 3.1 is not satisfied.

Also note that it is not necessary for a variable to be Inrange if it is DEF: under the
axioms of DEF, there can be a variable of a declared subrange type, whose value is both
DEF and not Inrange. In the definition of P [[A]] Q, no program is permitted to assign a
value to a subrange variable unless the value is Inrange. If P [[A]] Q holds, a subrange
variable can only be out of bounds before it has been assigned a value.
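
The two directions discussed above can be illustrated with a small Python sketch. It is not part of Runcheck: the names UNDEF, is_def and inrange are hypothetical, a single sentinel stands in for every value that is not DEF, and the implication from Inrange to DEF is built into the model rather than derived from the ordering axioms.

```python
# Illustrative model only; UNDEF, is_def and inrange are hypothetical names.
UNDEF = object()  # stands in for every value that is not DEF

def is_def(x):
    """DEF(x): x is an ordinary integer, not the sentinel."""
    return x is not UNDEF

def inrange(x, lo, hi):
    """Inrange(x, lo..hi) for a subrange bounded by the constants lo and hi."""
    return is_def(x) and lo <= x <= hi

samples = [UNDEF, -3, 0, 1, 4, 7]
# Inrange implies DEF on every sample ...
inrange_implies_def = all(is_def(v) for v in samples if inrange(v, 1, 4))
# ... but some samples are DEF without being Inrange for the subrange 1..4.
def_not_inrange = [v for v in samples if is_def(v) and not inrange(v, 1, 4)]
```

Under these assumptions the values -3, 0 and 7 are DEF but lie outside 1..4, which mirrors a subrange variable that holds a well defined but out-of-bounds value.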

4. Fundamental inference rules.

The  following  two  rules  are  included  in  both  the  unextended  and  extended  definitions: 

Concatenation of programs. (CONCAT)

    P {A} Q,  Q {B} R              P [[A]] Q,  Q [[B]] R
    -----------------              ---------------------
       P {A; B} R                     P [[A; B]] R


Consequence rule. (CONSEQ)

    P⊃Q,  Q {A} R,  R⊃S            P⊃Q,  Q [[A]] R,  R⊃S
    -------------------            ---------------------
         P {A} S                        P [[A]] S

These  rules  will  be  used  implicitly,  beginning  in  the  next  section  on  the  semantics  of 
expression  evaluation.  Later,  after  P  [[A]]  Q  has  been  defined,  we  will  develop  its  logical 
relationship  to  P  {A}  Q  in  more  detail. 


5. Expression Evaluation.

This  section  introduces  and  defines  evaluation  statements.  Evaluation  statements  have 
the  forms 

    Eval <Pascal expression>

    Locate <Pascal variable>

and in the extended semantics, they can be combined with Pascal statements and
assertion statements to form the general statements which appear inside brackets in a
formula P [[A]] Q. Evaluation statements will be used in section 6 to define the
conditions  for  error  free  execution  of  Pascal  statements  containing  expressions  and 
variables. 

The statement Eval E corresponds to the action of evaluating the expression E, which
may not have side effects. P [[Eval E]] Q is defined to mean that if P holds, then E
evaluates without runtime error, and if E terminates then Q will hold. Since E does not
have side effects, P and Q refer to states with the same values for variables. By having
two assertions, it is possible to make partial correctness statements about function calls.
For instance, if f is a (strictly) partial function,

    P(x) [[Eval f(x)]] Q(x, f(x))

may be a provably true statement about the evaluation of f(x), while the pure first order
statement

    P(x) ⊃ Q(x, f(x))

would not be true since it does not account for divergence of f(x).

The  other  form  of  evaluation  statement,  Locate  V,  corresponds  to  the  action  of 
computing  the  location  of  a  variable.  The  difference  between  this  and  evaluating  a 
variable  is  that  to  compute  a  location,  all  of  the  subscripts  must  be  evaluated  and  all 
dereferenced  pointers  must  be  evaluated,  but  the  variable  itself  need  not  have  a  value. 
For instance, to execute the assignment statement A[j] := e, the subscript j must have a
value in the correct range, but the left hand side A[j] is not required to have a value.
The definition of A[j] := e is expressed in terms of Locate A[j] and Eval e, since the
right hand side must yield a value. The formula P [[Locate V]] Q is defined to mean
that if P is true, then the location of V can be computed without execution errors, and if
the computation terminates, Q will hold.

The  exact  meaning  of  expression  evaluation  is  often  a  point  of  confusion  in 
programming  languages  and  definitions.  The  definitions  presented  here  assume  that 
sufficient  restrictions  are  used  to  prevent  side  effects.  Pascal  [9]  assumes  a  fixed  order 
of  evaluation  within  statements  and  expressions,  so  the  final  value  of  an  expression  is 
well  determined  even  in  the  presence  of  side  effects.  It  is  a  simple  matter  to  replace  a 
function  definition  which  has  side  effects  by  an  equivalent  procedure  definition,  by 
adding  a  new  VAR  parameter  to  return  the  function  value.  Thus  it  is  possible  to 
rewrite a Pascal program in which functions have side effects into an equivalent
program  in  which  function  calls  are  replaced  by  procedure  calls  and  all  expressions  are 
free  of  side  effects.  This  transformation  would  convert  the  evaluation  of  an  expression 
with  side  effects  into  a  sequence  of  procedure  calls  involving  some  new  variables  to 
store  temporary  values.  Since  this  transformation  can  be  easily  mechanized,  our  Pascal 
semantics  are  indirectly  applicable  even  to  programs  with  function  side  effects. 


If  runtime  errors  are  not  being  considered,  as  in  the  original  Hoare  axiom  system, 
function  calls  without  side  effects  can  be  defined  by  the  following  rule, 


    I_f(X1,...,Xn,G) {Function f(X1:t1; ...; Xn:tn):tf; B} O_f(X1,...,Xn,G),
    P {Eval A1; ...; Eval An} I_f(A1,...,An,G) ∧ (O_f(A1,...,An,G) ⊃ Q)
    ------------------------------------------------------------------------  (F1)
    P {Eval f(A1,...,An)} Q

which states that evaluation of f(A1,...,An) can be reduced to the evaluation of
A1,...,An in order, followed by the application of f, if I_f and O_f are shown to be
valid entry and exit assertions for f. G is the set of read only global variables, and B is
the body of the function f.


A  fine  point  to  be  considered  at  the  practical  level  is  that  some  compilers  change  the 
order  of  evaluation  within  expressions  if  there  are  no  side  effects.  If  the  evaluation  of 
an  expression  terminates,  it  terminates  with  the  same  result  under  all  orderings.  Since 
the truth of P {Eval E} Q depends only on whether evaluation of E terminates and the
value of each subexpression, all orders of evaluation are equivalent with respect to
P {Eval E} Q. The truth of P {Eval E} Q can be determined by choosing any possible
ordering and considering whether it is true for that ordering. Rule F1 above depends
on choosing one ordering. Thus F1 is correct even if there is reordering.

The  situation  is  different  when  proving  absence  of  runtime  errors.  Then,  different 
possible  orders  of  evaluation  must  be  considered  separately.  For  instance,  an  expression 
such as f(x)+a[i] might have a runtime error if i is out of range. If f(x) is evaluated
first and does not terminate, the error cannot occur. But if the order is changed and a[i]
is evaluated first, the error could occur. Since different orders of evaluation can give
different results, we define P [[Eval E]] Q to be true iff every order of execution is error
free and Q will hold after every terminating execution.
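
The f(x)+a[i] situation can be mimicked in a short Python sketch. Everything here is an assumption made for illustration: divergence is modelled by a Diverge exception so that the sketch terminates, a runtime error by a RuntimeFault exception, and the helper names are hypothetical.

```python
from itertools import permutations

class Diverge(Exception):
    """Stands in for a computation that never terminates."""

class RuntimeFault(Exception):
    """Stands in for a runtime error such as a bad subscript."""

def f(x):
    raise Diverge()          # models a call that loops forever

def index(a, i):
    if not 0 <= i < len(a):
        raise RuntimeFault("subscript out of range")
    return a[i]

def outcome(operands):
    """Evaluate operand thunks in the given order and classify the run."""
    try:
        return ("value", sum(op() for op in operands))
    except Diverge:
        return ("diverges", None)
    except RuntimeFault:
        return ("error", None)

def error_free_in_every_order(operands):
    """True only if no evaluation order reaches a runtime error."""
    return all(outcome(order)[0] != "error" for order in permutations(operands))

a, i, x = [10, 20, 30], 5, 1
call_f = lambda: f(x)
read_a = lambda: index(a, i)
left_first = outcome([call_f, read_a])     # f(x) first: the run diverges
right_first = outcome([read_a, call_f])    # a[i] first: the subscript error occurs
```

Because one ordering reaches the subscript error, error_free_in_every_order([call_f, read_a]) is False in this model, which is the reason the definition quantifies over every order of execution.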

Another  complication  is  the  possibility  of  short  circuit  evaluation  in  Boolean  expressions. 
In  evaluating  an  expression  such  as  r  AND  s,  when  the  value  of  r  is  False,  Pascal  permits 
compilers  to  omit  the  evaluation  of  s.  The  expression  r  AND  s  is  assumed  to  have  the 
value  False  because  r  is  False.  Observe  that  if  s  does  not  terminate  or  if  it  has  a 
runtime  error,  the  short  circuit  has  a  different  partial  correctness  semantics  from  full 
evaluation.  For  example, 

    P [[Eval r AND s]] False

may  be  true  for  full  evaluation  but  not  for  short  circuit.  Short  circuit  evaluation  is 
really  a  form  of  branching  within  expressions.  The  axiomatic  definition  assumes  that 
full  evaluation  is  used.  Some  languages,  such  as  ADA,  permit  short  circuit  evaluation  in 
certain  contexts  but  require  the  user  to  explicitly  request  it.  This  seems  to  be  a  cleaner 
approach, and we show below (rule E3S) how it can be formalized in the extended
semantics.
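
The difference between full and short circuit evaluation of r AND s can be sketched in Python as follows. This is an illustration under stated assumptions, not a description of any Pascal compiler: divergence of s is modelled by a hypothetical Diverge exception, and full_and and short_and are made-up names.

```python
class Diverge(Exception):
    """Stands in for a computation that never terminates."""

def s_diverges():
    raise Diverge()

def full_and(r, s):
    left, right = r(), s()          # both operands are always evaluated
    return left and right

def short_and(r, s):
    return s() if r() else False    # s is skipped when r is False

def run(thunk):
    """Classify one evaluation as terminating with a value or diverging."""
    try:
        return ("value", thunk())
    except Diverge:
        return ("diverges", None)

r_false = lambda: False
full = run(lambda: full_and(r_false, s_diverges))
short = run(lambda: short_and(r_false, s_diverges))
```

With full evaluation the run never terminates, so a partial correctness statement with postcondition False holds vacuously. With short circuit evaluation the run terminates with the value False, so the same statement fails.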

In  summary,  our  detailed  semantic  definition  of  Pascal  statements  will  be  based  on 
partial  correctness  assertions  about  evaluation  of  expressions  and  variables.  It  is  argued 
that  even  in  the  absence  of  side  effects,  the  definition  of  expression  evaluation  should  as 
a  practical  matter  account  for  possible  variations  in  the  order  of  evaluation.  We  will 
give  an  axiomatic  definition  that  does  not  assume  any  fixed  ordering.  On  the  other 
hand, function call rule F1 can be used if evaluation order is fixed, or if runtime errors
are  not  considered. 


The rules defining P [[Eval e]] Q are as follows:

Expression evaluation.

    P [[Locate V]] DEF(V) ∧ Q
    -------------------------  (E1)
    P [[Eval V]] Q

    (V is any Pascal variable.)

    P [[Eval A]] Q
    ------------------  (E2)
    P [[Eval (⊕ A)]] Q

    (where ⊕ is one of the monadic operators +, -, NOT)


The  following  rule  for  evaluation  of  an  operator  expression  contains  three  conditions. 
The  first  two  assert  that  A  and  B  evaluate  without  runtime  error  if  P  holds.  These 
conditions  make  the  rule  independent  of  any  fixed  order  of  evaluation,  by  requiring 
either  operand  to  evaluate  correctly  if  evaluated  first.  The  third  condition  states  that 
after both operands have been evaluated, Q must hold. Since there are no side effects
and  the  first  two  conditions  have  established  that  the  operands  evaluate  without  errors, 
the  order  in  the  third  condition  is  not  significant.  Notice,  though,  that  the  first 
condition  is  redundant  because  the  third  one  also  requires  A  to  evaluate  safely.  In 
stating  the  rest  of  the  rules,  we  will  omit  redundant  conditions  such  as  this. 

    P [[Eval A]] True,
    P [[Eval B]] True,
    P [[Eval A; Eval B]] Q
    ----------------------  (E3)
    P [[Eval A⊕B]] Q

    (where ⊕ is a relation sign or boolean connective.)

Rule E3S formalizes evaluation of ADA conditions. In ADA the boolean conditions for
controlling IF and WHILE statements etc. can have one of the forms

    <expression> AND THEN <expression>

    <expression> OR ELSE <expression>

which indicate that the left hand expression is to be evaluated first, after which the
right hand expression will be evaluated only if its value is needed to determine the
value of the condition. The following rule for evaluation of A AND THEN B states that it
must always be possible to evaluate A, and that 1) if A is false, Q must hold, and 2) if
A is true, it must be possible to evaluate B, after which Q must hold.

    P [[Eval A]] ¬A ⊃ Q,
    P [[Eval A; ASSUME A; Eval B]] Q
    --------------------------------  (E3S)
    P [[Eval A AND THEN B]] Q


Maxint  is  an  undeclared  integer  variable  representing  the  range  on  which  integer 
arithmetic  operators  do  not  overflow.  The  axiomatic  definition  makes  no  assumption 
about the value of Maxint. In order to prove absence of overflow, the user must
supply assertions relating Maxint to the computations in the program.

    P [[Eval B]] True,
    P [[Eval A; Eval B]] -MAXINT ≤ A⊕B ≤ MAXINT ∧ Q
    -----------------------------------------------  (E4)
    P [[Eval A⊕B]] Q

    (where ⊕ is one of the arithmetic operators +, -, *)

    P [[Eval B]] True,
    P [[Eval A; Eval B]] B≠0 ∧ Q
    ----------------------------  (E5)
    P [[Eval A⊕B]] Q

    (where ⊕ is DIV, MOD, or /)


Maxint can have any value such that integer arithmetic does not overflow in the range
-Maxint..Maxint. Note that many computers use twos complement arithmetic, in which
the  smallest  negative  integer  has  an  absolute  value  one  greater  than  the  largest  positive 
integer.  This  situation  (and  other  possible  number  systems  with  asymmetrical  ranges) 
can  be  more  accurately  modeled  by  introducing  a  separate  variable  Minint  to  stand  for 
the  smallest  integer,  and  making  the  obvious  changes  in  rules  E2,  E4,  and  E5. 
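
The side conditions of rules E4 and E5 correspond to the checks in the following Python sketch. It is only an illustration: the function name checked, the RuntimeFault exception and the choice of truncating DIV toward zero are assumptions of this sketch, and maxint is passed in explicitly because the definition fixes no particular value for it.

```python
class RuntimeFault(Exception):
    """Stands in for an arithmetic runtime error."""

def checked(op, a, b, maxint):
    """Apply op to a and b, raising RuntimeFault where E4 or E5 would fail."""
    if op == "DIV":
        if b == 0:
            raise RuntimeFault("division by zero")     # E5: B must be nonzero
        q = abs(a) // abs(b)                           # truncate toward zero (assumed)
        return q if (a < 0) == (b < 0) else -q
    result = {"+": a + b, "-": a - b, "*": a * b}[op]
    if not -maxint <= result <= maxint:                # E4: result within -maxint..maxint
        raise RuntimeFault("overflow")
    return result

in_range = checked("+", 30000, 2767, 32767)
quotient = checked("DIV", -7, 2, 32767)
```

A user of a verifier would discharge these checks statically from assertions about Maxint; the sketch merely shows which conditions the premises demand.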


The following rule defines the evaluation of a function call f(A1,...,An), where each of
the Ai is a value parameter and G is a list of read only global variables. For error free
evaluation of the function call, each of the Ai must evaluate and yield a value in the
proper range. The second and third premises of the rule state that if I_f and O_f are
valid entry and exit assertions for f, then they can be used to show P [[Eval f(A)]] Q. If the
parameters A and G satisfy the entry condition I_f, then O_f will hold on exit. Also,
f(A,G) will be DEF and Inrange; these properties are assured by the declaration rule.

    for i=1,...,n,  P [[Eval Ai]] Inrange(Ai, ti),
    I_f(X1,...,Xn,G) {Function f(X1:t1; ...; Xn:tn):tf; B} O_f(X1,...,Xn,G),
    P [[Eval A1; ...; Eval An]] I_f(A,G) ∧ (O_f(A,G) ∧ DEF(f(A,G)) ∧ Inrange(f(A,G), tf) ⊃ Q)
    ------------------------------------------------------------------------------------------  (E6)
    P [[Eval f(A1,...,An)]] Q


Location Validity.

    P [[Locate V]] P   (L1)

    (this is an axiom for any declared variable identifier V)

    P [[Locate R]] Q
    ------------------  (L2)
    P [[Locate R.F]] Q

    (where R is of a record type with a .F field)

    P [[Eval Z]] Z≠NIL ∧ Q
    ----------------------  (L3)
    P [[Locate Z↑]] Q

    (where Z is of a pointer type)

    P [[Eval I]] True,
    P [[Locate A; Eval I]] Inrange(I, indextype(A)) ∧ Q
    ---------------------------------------------------  (L4)
    P [[Locate A[I]]] Q

    (where A is of an array type)
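
The distinction between Locate and Eval for an array element can be sketched in Python. The store layout, the UNDEF marker and the function names are hypothetical choices for this illustration and do not come from the Runcheck system.

```python
UNDEF = object()   # hypothetical marker for an uninitialized cell

class RuntimeFault(Exception):
    """Stands in for a runtime error."""

def eval_var(store, name):
    """Eval V for a simple variable: V must hold a DEF value (rule E1)."""
    value = store[name]
    if value is UNDEF:
        raise RuntimeFault(name + " is uninitialized")
    return value

def locate_elem(store, array, index_var, lo, hi):
    """Locate A[I]: I must evaluate to a value in lo..hi (rule L4)."""
    i = eval_var(store, index_var)
    if not lo <= i <= hi:
        raise RuntimeFault("subscript out of range")
    return array, i - lo              # the cell itself may still be UNDEF

def eval_elem(store, array, index_var, lo, hi):
    """Eval A[I]: locate the cell, then require its contents to be DEF."""
    name, offset = locate_elem(store, array, index_var, lo, hi)
    value = store[name][offset]
    if value is UNDEF:
        raise RuntimeFault("array element is uninitialized")
    return value

store = {"j": 2, "A": [UNDEF] * 5}
name, offset = locate_elem(store, "A", "j", 0, 4)   # succeeds on an empty array
store[name][offset] = 7                              # models A[j] := 7
value = eval_elem(store, "A", "j", 0, 4)
```

Locating A[j] succeeds before any element has been assigned, while evaluating A[j] at that point would be an error; this is why the assignment rule needs Locate for the left hand side and Eval only for the right hand side.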


Example 2: Show Q [[Eval a[i]+p↑]] True, where

    Q ≡ DEF(i) ∧ 0≤i≤100 ∧ DEF(a[i]) ∧ 0≤a[i]≤25 ∧ DEF(p) ∧ p≠NIL ∧ p↑=6 ∧ 1000≤MAXINT

with the variable declarations

    VAR a: ARRAY[0:100] OF INTEGER;
    VAR i: INTEGER;
    VAR p: ↑INTEGER;

By applying the inference rules in reverse, we can find simpler sufficient conditions for
the formula to be true. We will continue to work backwards until we reach sufficient
conditions that are obviously true. At this point, the formula will be proven, because it
will be possible to construct a formal proof by starting with the final conditions and
applying the inference rules until the original formula is deduced. The first step is to
use rule E4 in reverse, reducing the problem of proving a statement about Eval a[i]+p↑
to proving statements about Eval a[i] and Eval p↑.

    Q [[Eval p↑]] True,   (5.1)

    and Q [[Eval a[i]; Eval p↑]] -MAXINT ≤ a[i]+p↑ ≤ MAXINT.   (5.2)

Before finishing the example, we pause to mention a fact about the extended semantics
which is helpful in removing redundancy from proofs. Since expressions do not have
side effects, we can assume in proofs that the state does not change when an expression
is evaluated. The following lemma states this fact in a useful form.

    Lemma. ⊢ P [[Eval e]] True iff ⊢ P [[Eval e]] P.

           ⊢ P [[Locate e]] True iff ⊢ P [[Locate e]] P.

Another point about redundancy is that when applying the inference rules directly to
prove P [[Eval E]] Q, the proof of error free execution of some subexpressions may appear
many times. A mechanical evaluator of the preconditions can easily take the repetition
into account and only verify each subexpression once.

Continuing the example, show 5.1:

    Q [[Eval p↑]] True

    ⇐ Q [[Locate p↑]] DEF(p↑)   (by E1)

    ⇐ Q [[Eval p]] p≠NIL ∧ DEF(p↑)   (by L3)

    ⇐ Q [[Locate p]] DEF(p) ∧ p≠NIL ∧ DEF(p↑)   (by E1)

    ⇐ Q ⊃ (DEF(p) ∧ p≠NIL ∧ DEF(p↑))   (by L1 and CONSEQ)

    ⇐ True.   (by definition of Q)

Next, show Q [[Eval a[i]]] True

    ⇐ Q [[Locate a[i]]] DEF(a[i])   (by E1)

    ⇐ Q [[Eval i]] DEF(a[i]),

      and Q [[Locate a; Eval i]] 0≤i≤100 ∧ DEF(a[i])   (by L4)

These last two formulas are trivially provable, since the assertion Q implies that i has a
value, and the whole variable a is always a valid location by L1. Having shown that
both a[i] and p↑ evaluate without any errors, we can use the CONCAT rule to infer that
one can be evaluated after the other, i.e.

    Q {Eval a[i]; Eval p↑} True   (by CONCAT).   (5.3)

It only remains to show that there is no overflow, formula 5.2.

    Q {Eval a[i]; Eval p↑} -MAXINT ≤ a[i]+p↑ ≤ MAXINT

    ⇐ Q ⊃ -MAXINT ≤ a[i]+p↑ ≤ MAXINT   (by CONSEQ and lemma applied to 5.3)

    ⇐ True.
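
As an informal cross-check of Example 2 (not a substitute for the proof), the conditions can be tested by brute force over the states that Q allows. The sketch below assumes that DEF holds for i, a[i], p and p↑ as Q states, picks the smallest Maxint permitted by Q, and uses hypothetical names throughout.

```python
from itertools import product

MAXINT = 1000      # smallest value permitted by the conjunct 1000 <= MAXINT
P_TARGET = 6       # value of p^ fixed by Q

def error_free(i, a_i, p_target, maxint):
    """Checks made while evaluating a[i]+p^, given that all operands are DEF."""
    subscript_ok = 0 <= i <= 100
    no_overflow = -maxint <= a_i + p_target <= maxint
    return subscript_ok and no_overflow

# Every (i, a[i]) pair allowed by Q: 0 <= i <= 100 and 0 <= a[i] <= 25.
states = list(product(range(101), range(26)))
all_error_free = all(error_free(i, a_i, P_TARGET, MAXINT) for i, a_i in states)
largest_sum = max(a_i + P_TARGET for _, a_i in states)
```

The largest sum that can arise is 25 + 6, far below any Maxint satisfying Q, which is the arithmetic fact behind the final step of the proof.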


Example 3: User defined partial functions in expressions.

    VAR x: INTEGER;
    VAR a: ARRAY[0:100] OF BOOLEAN;

    FUNCTION sqrt(n: INTEGER): INTEGER;
    ENTRY True;
    EXIT 0≤sqrt≤n
    BEGIN
      % If n < 0, then loop forever without execution errors;
        otherwise, set sqrt ← integer part of square root n.
      %
    END;

Suppose the function sqrt has been defined to correctly return the integer square root
of n unless n is negative, in which case it loops forever without runtime errors. Using
the function declaration rule which will be given in section 6.3, it is possible to prove

    True [[Function sqrt(n:INTEGER):INTEGER; body]] 0≤sqrt(n)≤n.   (5.4)

The entry and exit specifications of sqrt can then be used to show that if sqrt is called
with an argument x whose value is at most 100, the location of the variable a[sqrt(x)]
can be computed without runtime error.

    DEF(x) ∧ x≤100 [[Locate a[sqrt(x)]]] True

    ⇐ DEF(x) ∧ x≤100 [[Eval sqrt(x)]] True,   (5.5)

      and DEF(x) ∧ x≤100 [[Locate a; Eval sqrt(x)]] 0≤sqrt(x)≤100   (by L4)   (5.6)

Using the function call rule E6, the first part 5.5 reduces to

    DEF(x) ∧ x≤100 [[Eval sqrt(x)]] True

    ⇐ DEF(x) ∧ x≤100 [[Eval x]] True,
      and True [[Function sqrt(n:INTEGER): INTEGER; body]] 0≤sqrt(x)≤x,
      and DEF(x) ∧ x≤100 [[Eval x]] True ∧ (0≤sqrt(x)≤x ∧ DEF(sqrt(x)) ⊃ True)

which are all true.

The second part 5.6 can be simplified

    DEF(x) ∧ x≤100 [[Locate a; Eval sqrt(x)]] 0≤sqrt(x)≤100

    ⇐ DEF(x) ∧ x≤100 [[Eval sqrt(x)]] 0≤sqrt(x)≤100   (by L1 and CONCAT)

    ⇐ DEF(x) ∧ x≤100 [[Eval x]] (0≤sqrt(x)≤x ∧ DEF(sqrt(x)) ⊃ 0≤sqrt(x)≤100)   (by E6)

    ⇐ DEF(x) ∧ x≤100
        [[Locate x]] DEF(x) ∧ (0≤sqrt(x)≤x ∧ DEF(sqrt(x)) ⊃ 0≤sqrt(x)≤100)   (by E1)

    ⇐ DEF(x) ∧ x≤100 ⊃ DEF(x) ∧ (0≤sqrt(x)≤x ∧ DEF(sqrt(x)) ⊃ 0≤sqrt(x)≤100)   (by L1 and CONSEQ)

    ⇐ True
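
Example 3 can also be checked informally by running a model of sqrt over a range of arguments. In this Python sketch the looping branch is replaced by a hypothetical Diverge exception so that the test terminates, and math.isqrt plays the role of the function body; neither choice comes from the original text.

```python
import math

class Diverge(Exception):
    """Stands in for a call that loops forever."""

def sqrt(n):
    """Integer square root for n >= 0; models divergence for negative n."""
    if n < 0:
        raise Diverge()
    return math.isqrt(n)

def locate_a_sqrt(x):
    """Locate a[sqrt(x)] for a declared as ARRAY[0:100]."""
    index = sqrt(x)
    if not 0 <= index <= 100:
        raise IndexError("subscript out of range")
    return index

def classify(x):
    try:
        return ("located", locate_a_sqrt(x))
    except Diverge:
        return ("diverges", None)

# Every DEF value of x with x <= 100 in a sample window either locates a cell or diverges.
results = {x: classify(x) for x in range(-20, 101)}
exit_assertion_holds = all(0 <= sqrt(n) <= n for n in range(0, 101))
```

No argument in the window produces a subscript error: negative arguments diverge, and nonnegative ones yield an index no larger than x, hence no larger than 100.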

6.  Extended  axiomatic  semantics  of  Pascal 

6.1  Assume  statements 

The meaning of the statement ASSUME L is that L can be assumed to be a true assertion
at a certain point in a program. Assume statements do not initially appear in programs,
but can be introduced during the course of a proof to record logical assumptions which
hold at points within a program. For instance, the rule for IF statements reduces a
formula involving IF L THEN S1 ELSE S2 to two formulas for the two cases of the
condition L. In one formula, the statement ASSUME L records the assumption that L was
true, and in the other formula, ASSUME ¬L records the assumption that L was false.

    (P∧L) ⊃ Q
    ----------------  (ASSUME)
    P [[ASSUME L]] Q

6.2  Executable  statements 
Assignment  statements 

The  following  rule  applies  to  all  assignment  statements. 

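    P [[Eval e]] True,
    P [[Locate pv; Eval e]] Inrange(e, type(pv)) ∧ Q[e/pv]
    ------------------------------------------------------  (ASSIGN)
    P [[pv := e]] Q

where pv is any Pascal variable, and Q[e/pv] denotes Q with e substituted for pv.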

In order for P [[pv := e]] Q to hold, it is necessary for the assignment to execute without
any runtime errors, and for Q to be true in the updated state. The rule requires the
right hand side, e, to evaluate without runtime error and to yield an initialized value;
the location calculation for left hand side pv is also required to be free from errors. If
pv is a subrange variable, the Inrange clause requires the value of e to be in the correct
range. The formula in the second premise is formed from Q by substituting e for the
Pascal variable pv, using the definition of substitution given in section 2.2.
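
The runtime conditions that the assignment rule rules out can be sketched as a checked assignment in Python. The dictionary-based store, the ranges table and the names UNDEF, RuntimeFault and assign are assumptions made for this illustration only.

```python
UNDEF = object()   # hypothetical marker for an uninitialized value

class RuntimeFault(Exception):
    """Stands in for a runtime error."""

def assign(store, ranges, pv, value):
    """Model pv := e, where value is the result of evaluating e."""
    if value is UNDEF:
        raise RuntimeFault("right hand side is uninitialized")
    lo, hi = ranges[pv]
    if not lo <= value <= hi:                 # the Inrange clause of the rule
        raise RuntimeFault("value out of range for " + pv)
    store[pv] = value

ranges = {"digit": (0, 9)}      # digit is declared with the subrange 0..9
store = {"digit": UNDEF}        # digit itself need not be initialized beforehand
assign(store, ranges, "digit", 7)
```

Note that the target may be uninitialized before the assignment; only the right hand side is required to be DEF and Inrange.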

IF statements

    P [[Eval L; ASSUME L; S1]] Q,
    P [[Eval L; ASSUME ¬L; S2]] Q
    -----------------------------
    P [[IF L THEN S1 ELSE S2]] Q

CASE statements

    for i=1,...,n,  P [[Eval X; ASSUME X∈Ci; Si]] Q,
    P [[Eval X]] X∈{C1,...,Cn}
    ----------------------------------------  (CASE)
    P [[CASE X OF C1:S1; ...; Cn:Sn]] Q

The Ci are lists of constants for each branch of the CASE statement. The second
condition requires the CASE expression X to evaluate to one of the constants in one of
the Ci.

NEW  procedure 

The following axiom states that the effect of the Pascal statement NEW(x), where x is a
variable identifier of a pointer type, is to change the value of x to a new pointer value
x0, and to add the new value x0 to the reference class.

    (¬(x0 ∈ POINTERSTO(#T)) ∧ DEF(x0) ∧ x0≠NIL ⊃ Q[#T∪{x0}/#T][x0/x]) [[NEW(x)]] Q   (NEW1)

where x is an identifier of type ↑T (pointer to object of type T),
x0 is a fresh identifier not appearing in Q,
#T is the reference class for type T,
#T ∪ {x0} stands for the reference class after adding an object pointed to by x0,
Q[#T∪{x0}/#T][x0/x] denotes Q with #T ∪ {x0} substituted for #T and x0 substituted for x.

The antecedents on the left side of the rule state that 1) the value x0 generated by NEW
is a new pointer, not a pointer to the reference class #T, 2) x0 has an initialized value,
and 3) x0 is not the pointer NIL. The term #T ∪ {x0} represents the new reference class
after the dynamic variable x0↑ has been allocated. A more complete discussion of
POINTERSTO and the operation of adding new elements to a reference class can be found
in [11].

The following rule reduces a NEW statement involving a selected variable to a NEW
statement with an argument which is an identifier.

    P [[NEW(S0); S := S0]] Q
    ------------------------  (NEW2)
    P [[NEW(S)]] Q

where S0 is a new identifier not appearing in the scope, P, or Q;
the declaration VAR S0: type(S) is added to the scope.


WHILE  statements 

    P ⊃ I,
    I [[Eval B; ASSUME B; S]] I,
    I [[Eval B]] ¬B ⊃ Q
    ---------------------------------  (WHILE)
    P [[INVARIANT I WHILE B DO S]] Q


In  this  rule,  the  invariant  is  chosen  to  be  true  before  each  evaluation  of  the  While  test 
B.  The  rule  can  be  rearranged  to  correspond  to  other  choices  of  invariants. 


6.3  Functions  and  procedures 


6.3.1  Function  declaration 

With the function declaration rule, one can infer that I and O are valid entry and exit
specifications for a function f, if for inputs satisfying I, the body of the function
executes without runtime errors and assigns a final value to the function which satisfies
the exit assertion O.

Function declaration

    I(X1,...,Xn,G) ∧ DEF(X1) ∧ ... ∧ DEF(Xn) ∧ Inrange(X1,t1) ∧ ... ∧ Inrange(Xn,tn)
        [[B]] O(f,X1,...,Xn,G) ∧ DEF(f) ∧ Inrange(f, tf)
    --------------------------------------------------------------------------------  (FD)
    I(X1,...,Xn,G) [[Function f(X1:t1; ...; Xn:tn):tf; B]] O(f(X1,...,Xn),X1,...,Xn,G)

where f has the function declaration

    FUNCTION f(X1:t1; ...; Xn:tn):tf;
    GLOBAL G;
    ENTRY I(X1,...,Xn,G);
    EXIT O(f,X1,...,Xn,G);
    B;


The rule requires that the function have only value parameters X1,...,Xn and a set of
read only globals G. The rule assumes that each of the value parameters has an
initialized value in the correct range; this assumption is justified by the call rule, which
checks the actual parameters. If global variables are accessed, the entry assertion must
assert that they have been initialized.

In the exit assertion O(f,X1,...,Xn), the variable f stands for the value returned by the
function. The rule checks that the body assigns f a value in the correct range. As we
will see in section 7.4, the condition Inrange(f, tf) appearing after execution of the
body is redundant. Because the declaration rule requires f to be DEF after execution of
the body, it is not necessary to require f to be Inrange.


6.3.2  Note  on  Global  Variables 

Runcheck requires the user to declare lists of all global variables that could potentially
be accessed or altered by each subprogram. The system checks the lists by a syntactic
examination of the subprogram body. For instance, a global variable g which is used in
an assignment statement g := e must be declared read write. Also, if the body of p
contains calls to q, then all globals listed for q must be listed for p.

Reference  classes  are  a  special  case  of  global  variables  which  are  implicitly  accessed  or 
altered  although  they  do  not  appear  explicitly  in  the  executable  program  text.  If  a 
subprogram evaluates p↑, this is considered an implicit access to a reference class. An
assignment p↑ := e is considered an implicit write to the reference class. The system
requires all reference classes which are used as globals of a subprogram to be explicitly
listed by the user as global parameters.

The  presence  of  a  pointer  formal  parameter  does  not  necessarily  imply  that  a  reference 
class  will  be  accessed  or  altered  by  a  subprogram.  For  instance,  a  procedure  p  with  a 
VAR  formal  parameter  x  which  is  a  pointer  to  an  integer, 

    TYPE ptr = ↑INTEGER;

    PROCEDURE p(VAR x: ptr);
    BEGIN x := NIL END;

may  assign  to  x  without  altering  the  reference  class  #INTEGER.  No  globals  would  be 
listed  for  this  procedure,  since  it  changes  only  the  pointer  x  and  not  any  of  the  integer 
variables  pointed  to. 

On the other hand, in a procedure p2 which assigns to x↑, it would be necessary to list
the reference class #INTEGER as a read write global,

    TYPE ptr = ↑INTEGER;

    PROCEDURE p2(VAR x: ptr);
    GLOBAL (VAR #INTEGER);
    BEGIN x↑ := 0 END;

because  an  integer  variable  accessed  by  a  pointer  is  changed. 

Observe  that  depending  on  the  actual  argument,  a  call  to  the  procedure  p  above  could 
have  the  effect  of  changing  a  reference  class,  as  in  the  call 

    TYPE ptr = ↑INTEGER;
         ptr2 = ↑ptr;
    VAR y: ptr2;

    p(y↑);   % changes #ptr %

which changes the reference class #ptr of variables of type ptr which are accessed by
pointers. In this case #ptr is not considered a global, although the call rules do account
for the fact that part of #ptr is altered by being passed as a VAR parameter. Which
reference class is altered in this example depends on the call, not on the definition of p.
For example, in the call

    TYPE ptr = ↑INTEGER;
         ptrarray = ARRAY[1..100] OF ptr;
         ptrptrarray = ↑ptrarray;
    VAR z: ptrptrarray;

    p(z↑[50]);

z is a pointer to variables of type ptrarray, z↑ is an array of pointer variables, and
z↑[50] is a pointer to an integer, and hence the correct type to be an argument to
procedure p. The variable which p changes in this case is an element of an array
accessed by a pointer, and this causes a change to the reference class #ptrarray.

The  ability  of  a  procedure  with  a  VAR  pointer  parameter  to  change  different  reference 
classes  depending  on  the  actual  parameter,  is  exactly  analogous  to  the  ability  of  a 
procedure  with  a  VAR  integer  parameter  to  change  components  of  different  integer 
arrays. 

    PROCEDURE q(VAR x: INTEGER);
    BEGIN x := 0 END;   % no globals %

The first call in

    TYPE arr = ARRAY[1..600] OF INTEGER;
    VAR v1, v2: arr;

    q(v1[60]);
    q(v2[60]);

alters part of v1, but the second one alters part of v2.

6.3.3 Procedure declaration

    I(X,Y,G) ∧ DEF(X1) ∧ ... ∧ DEF(Xm) ∧ Inrange(X1,t1) ∧ ... ∧ Inrange(Xm,tm)
        [[B]] O(X,Y,G)
    -----------------------------------------------------------------------------------  (PD)
    I(X,Y,G) [[Procedure p(X1:t1; ...; Xm:tm; VAR Y1:u1; ...; VAR Yn:un); B]] O(X,Y,G)

where p has the procedure declaration

    PROCEDURE p(X1:t1; ...; Xm:tm; VAR Y1:u1; ...; VAR Yn:un);
    GLOBAL GR, VAR GW;
    ENTRY I(X,Y,G);
    EXIT O(X,Y,G);
    B;

GR are the read only global variables,
GW are the read write global variables,
G stands for the set of all global variables, GR ∪ GW.


Like  the  function  declaration  rule,  the  procedure  declaration  rule  assumes  that  the  value 
parameters  are  initialized  by  each  call  with  values  in  the  correct  range.  On  the  other 
hand,  there  is  nothing  unusual  about  procedures  that  work  correctly  with  uninitialized 
VAR  parameters.  Consider  a  simple  procedure  p  which  is  called  with  an  integer  j  and 
two  array  variables,  x  and  y,  and  assigns  x[j]  the  value  y[j]. 

    TYPE s = 1..100;
    TYPE arr = ARRAY[s] OF INTEGER;

    PROCEDURE p(j: s; VAR x, y: arr);
    BEGIN
      x[j] := y[j];
    END;


Since  the  procedure  does  not  test  the  range  of  j  before  executing  the  assignment,  a  call  to 
p  will  produce  a  subscripting  error  unless  j  is  between  1  and  100.  Also,  the  actual 
variable  supplied  for  y[j]  must  have  been  assigned  a  value  before  the  call  to  p.  No 
other  restrictions  are  needed  to  assure  error  free  execution.  In  particular,  p  will  work 
regardless  of  whether  x  has  been  initialized,  and  regardless  of  whether  portions  of  y 
other than y[j] have been initialized. For instance, the following sequence executes
without errors.

VAR a, b: arr;

VAR k: INTEGER;

BEGIN
k := 50;
b[k] := 1000;
p(k, a, b);

% now a[50] = 1000 %

END;


The behavior of p can be specified by providing it with entry and exit assertions.

TYPE s = 1..100;

TYPE arr = ARRAY[s] OF INTEGER;

PROCEDURE p(j: s; VAR x, y: arr);

INITIAL y = y0;

ENTRY DEF(y[j]);

EXIT y = y0 ∧ x[j] = y[j];

BEGIN
x[j] := y[j];

END;


The entry assertion states that y[j] has a value when p is called. Note that since j is a
value parameter with a subrange type, the declaration rule assumes that it will be
supplied with a value in the correct range; this will be checked by the call rule. The
INITIAL statement simply introduces a new name y0 to stand for the initial value of y at
the time of entry to the procedure. The exit assertion states that the value of y is
unchanged, and that x[j] is equal to y[j].


To  summarize  the  point  of  this  example,  all  of  the  rules  for  subprograms  assume  that 
value  parameters  must  be  supplied  with  initialized  values  in  the  correct  range.  This  is 
our  interpretation  of  what  it  means  to  correctly  call  a  subprogram  with  a  value 
parameter.  No  such  assumption  can  be  made  for  VAR  parameters,  and  so  it  is 
necessary  to  describe  the  behavior  of  each  one  by  means  of  entry  and  exit  assertions. 
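
The distinction between value and VAR parameters can be made concrete with a small executable model. The sketch below is written in Python rather than Pascal, purely as an illustration; the names UNDEF, Uninitialized, new_array and read are hypothetical helpers and are not part of Runcheck. It assumes that an uninitialized element can be represented by a distinguished marker and that reading such an element is the runtime error being modelled.

```python
class Uninitialized(Exception):
    """Raised when an element that holds no value is read."""

UNDEF = object()  # hypothetical marker for "no value assigned yet"

def new_array(lo, hi):
    """Model a Pascal array lo..hi whose elements are all uninitialized."""
    return {i: UNDEF for i in range(lo, hi + 1)}

def read(arr, i):
    """Read arr[i], failing on a bad subscript or an uninitialized element."""
    if i not in arr:
        raise IndexError(f"subscript {i} out of range")
    if arr[i] is UNDEF:
        raise Uninitialized(f"element {i} has no value")
    return arr[i]

def p(j, x, y):
    """Model of p: x[j] := y[j]. x and y are shared with the caller (VAR)."""
    # j is a value parameter of type 1..100: the call must supply a
    # defined, in-range value, so that is checked on entry.
    if not (isinstance(j, int) and 1 <= j <= 100):
        raise ValueError("value parameter j outside 1..100")
    # Only y[j] is read, so only y[j] needs to be initialized.
    x[j] = read(y, j)

a, b = new_array(1, 100), new_array(1, 100)
k = 50
b[k] = 1000
p(k, a, b)  # succeeds although a and most of b are uninitialized
```

After the call, a[50] holds 1000 and every other element of a is still uninitialized; calling p(51, a, b) would raise Uninitialized, which corresponds to violating the entry assertion DEF(y[j]).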


There may of course be implementations of Pascal in which calls with
value parameters produce the desired results in some cases even if the actual
parameter is not fully initialized. This is merely an artifact of certain possible
implementation techniques. Our definition attempts to capture what is meant by the
language itself, and is intended to be sufficiently restrictive to be consistent with all
possible implementations.

As was mentioned earlier, the initial value of local variables is not specified by the
function or procedure declaration rules. Another approach, which seems reasonable at
first glance, is to assert that every local is initially undefined. This is not needed in the
extended semantics, because for P [[A]] Q to be valid, every variable must be assigned a
value which is DEF before its value is used.

The declaration rules could be modified to specify an initial value for locals, but this
would unnecessarily complicate the definition and lead to confusion in applying the
extended semantics. It would be possible to introduce a new constant Cs for each sort s to
stand for the initial value. The axioms would be changed to state that for each of these
constants, ¬DEF(Cs), and also ¬DEF(t) for terms t formed by selecting components of Cs.
For each local L, L=Cs would be added as a premiss in the declaration rule. But this is
an unnecessary complication. Also, it does not accurately model the implementation of
Pascal, in which initial values are left unspecified to reduce overhead. For this reason,
it would give confusing results in practice. If a program, A, never used two variables of
the same sort, x and y, and otherwise executed without errors, it would be possible to
prove that the variables were equal after the program,

P {A} x=y.

Such a result differs from the implementation and probably conceals a programming
error.


6.3.4  Procedure  call 

The procedure call rule requires each value parameter to evaluate without runtime
error, yielding a value in the correct range, and each VAR parameter to yield a location
without runtime error.

for i=1,...,m, P [[Eval Ai]] Inrange(Ai, ti),
for i=1,...,n, P [[Locate Vi]] True,
I(X,Y,G) [[Procedure p(X1:t1; ...; Xm:tm; VAR Y1:u1; ...; VAR Yn:un); B]] O(X,Y,G),
P [[Eval A1; ...; Eval Am; Locate V1; ...; Locate Vn]] Disjoint-set(V ∪ G) ∧ I(A,V,G)
    ∧ ∀Z,GW (O(A,Z,GR,GW) ⇒ Q[Z1/V1, ..., Zn/Vn])
---------------------------------------------------------------------------------- (PC1)
P [[p(A1,...,Am,V1,...,Vn)]] Q

Here Q[Z1/V1, ..., Zn/Vn] denotes Q with each Vi replaced by Zi.


Each  of  the  actual  VAR  parameters,  Vi,  must  be  a  distinct  Pascal  variable  not  in  GW. 
Note  that  this  definition  depends  on  the  definition  of  substitution  when  Vi  is  not  an 
identifier. 


7.  Metatheory  of  the  extended  definition 

In  this  section,  we  discuss  some  properties  of  the  extended  definition  which  are  helpful 
in  reducing  the  complexity  of  program  specifications  and  the  length  of  proofs. 

By  itself,  the  extended  semantics  is  not  a  complete  solution  to  the  problem  of  verifying 
the  absence  of  common  errors.  In  practice,  there  are  two  main  kinds  of  difficulty  in 
doing  actual  verifications.  These  practical  difficulties  were  carefully  considered  in  the 
design  of  the  Runcheck  system. 

The  problem  of  redundancy  in  proofs  is  solved  in  Runcheck  by  a  special  simplifier 
which  efficiently  eliminates  redundant  verification  conditions. 

A more serious problem is the need for lengthy, detailed specifications and inductive
assertions in programs. Several distinct approaches are needed to deal with this
problem. In Appendix A, we discuss the derived WHILE rule, which shows how the
extended definition reduces the need for detailed documentation. The derived WHILE
rule and other rules are logically justified by certain simple properties of the theory of
the extended definition, which are presented in the remainder of this section.


7.1  Ordinary  Semantics  Lemma 

Any  specification  provable  in  the  extended  definition  is  also  provable  in  the  ordinary 
definition. 

Lemma 7.1 If ⊢ P [[A]] Q, then ⊢ P {A} Q.

The  significance  of  this  lemma  is  that  all  specifications,  even  those  involving  DEF,  are 
theorems  of  the  ordinary  system.5  The  extended  definition  only  places  more  restrictions 
on  the  allowable  computations.  Consistency  of  the  extended  definition  is  a  direct 
consequence  of  this  lemma. 


7.2  Specification  lemma 

When  proving  complicated  specifications  for  a  program,  it  is  sometimes  helpful  to  prove 
the  specifications  without  considering  possible  runtime  errors,  and  then  prove  separately 
that  no  errors  occur.  In  this  way,  the  details  about  runtime  errors  can  be  isolated  in  the 
proof.  The  next  lemma  says  that  proofs  in  the  extended  definition  can  always  be 
factored  in  this  manner. 

Lemma 7.2 If ⊢ P {A} Q, and ⊢ P1 [[A]] Q1, then ⊢ P∧P1 [[A]] Q∧Q1.

The reason for this is that if both P {A} Q and P1 [[A]] Q1 can be proven separately,
then it is always possible to combine the proofs to show P∧P1 [[A]] Q∧Q1.

The design of the automatic Documenter in Runcheck is based on this lemma. The
documenter constructs inductive assertions* that are valid in the ordinary semantics.
The assertions can then be assumed true in proofs in the extended semantics. Thus the
documenter does not have to consider possible runtime errors while constructing the
invariants.


5 In the case of built in procedures, it is necessary to choose slightly nonstandard definitions if the resulting system is to be
complete with respect to specifications involving DEF. The 'ordinary' system that we have in mind has axioms stating that the
results of built in procedures such as READ and NEW are DEF.


7.3  LESSDEF  lemma 

One of the basic properties of the extended definition is that if P [[S]] Q holds, S cannot
assign an uninitialized value to any variable. Over any sequence of statements that
executes without runtime error, the extent of variable initialization cannot decrease.

LESSDEF(x, y), a predicate for two terms of the same sort, is defined to be true if y is at
least as completely initialized as x.

LD1) if x and y are of the same simple sort,

LESSDEF(x, y) ≡ DEF(x) ⇒ DEF(y).

LD2) if x and y are of the same record sort, and the field names are f1, ..., fn,

LESSDEF(x, y) ≡ LESSDEF(x.f1, y.f1) ∧ ... ∧ LESSDEF(x.fn, y.fn).

LD3) if x and y are of sort ARRAY[a..b] OF t,

LESSDEF(x, y) ≡ (∀j a≤j≤b ⇒ LESSDEF(x[j], y[j])).

LD4) if x and y are of sort REFCLASS(t) for some t,

LESSDEF(x, y) ≡ (∀p∈POINTERSTO(x) LESSDEF(x⊂p⊃, y⊂p⊃)).

The LESSDEF lemma says that for any variable in a program that executes without
errors, the final value will be at least as fully initialized as the initial value.

Lemma 7.3 If ⊢ P [[A]] True, and v is a declared variable identifier, then

⊢ P ∧ v'=v [[A]] LESSDEF(v', v)

where v' is a new identifier not appearing in P, A, or the scope.

* Refer to [2] for details of the documenter.

In Runcheck, the lemma is used to reduce the need for detailed assertions on loops and
procedures. If a variable is known to be DEF before entering a loop, it is not necessary
to state in the invariant that it continues to be DEF. Similar assertions about VAR
parameters can be omitted from procedure specifications.
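
The four clauses of the LESSDEF definition translate directly into a recursive check. The following Python sketch is only a model for experimenting with the definition, not how Runcheck represents terms. It assumes that an uninitialized simple value is represented by None, that records and reference classes are dictionaries (keyed by field name or by pointer), and that arrays are lists of equal length.

```python
UNDEF = None  # assumed stand-in for a simple value that was never assigned

def lessdef(x, y):
    """Return True if y is at least as completely initialized as x."""
    if isinstance(x, dict):
        # LD2 / LD4: compare field by field, or pointer by pointer over
        # the pointers of x (y may contain additional pointers).
        return all(k in y and lessdef(x[k], y[k]) for k in x)
    if isinstance(x, list):
        # LD3: compare the arrays element by element.
        return len(x) == len(y) and all(lessdef(a, b) for a, b in zip(x, y))
    # LD1: DEF(x) implies DEF(y) for simple values.
    return x is UNDEF or y is not UNDEF

before = [1, UNDEF, {"f": UNDEF, "g": 2}]
after = [5, 7, {"f": UNDEF, "g": 3}]
forward = lessdef(before, after)    # initialization only grew
backward = lessdef(after, before)   # after[1] is defined, before[1] is not
```

Here forward is True and backward is False: values may change freely, but a component that was defined may not become undefined.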


Example  4:  Merging  two  sorted  arrays 


This  example  shows  how  Runcheck  uses  the  Lessdef  lemma  to  reduce  the  need  for 
repetitious,  detailed  assertions.  The  program  takes  as  input  previously  sorted  arrays  A 
and  B  of  length  100  and  merges  their  contents  into  the  array  C,  which  has  length  200. 
The  user  has  supplied  only  an  ENTRY  assertion  saying  that  A  and  B  are  fully 
initialized,  and  an  EXIT  assertion  saying  that  C  is  fully  initialized.  The  interesting 
aspect  of  this  example  is  that  the  initialization  of  C  takes  place  in  two  loops.  The  first 
loop  partially  initializes  C,  merging  elements  from  A  and  B  until  either  A  or  B  has  been 
completely  transferred.  Then  the  initialization  of  C  continues  in  either  the  second  loop 
or  the  third  loop. 

TYPE INARR=ARRAY[1:100] OF INTEGER;

TYPE OUTARR=ARRAY[1:200] OF INTEGER;

VAR I,J,N:INTEGER;

VAR A,B:INARR; C:OUTARR;

ENTRY DEF(A)∧DEF(B);

EXIT DEF(C);

BEGIN

N:=100;

I:=1;

J:=1;

INVARIANT DEFRANGE(1, I+J-2, C)

∧ 1≤I ∧ I≤N+1 ∧ 1≤J ∧ J≤N+1
WHILE (I≤N) AND (J≤N) DO
BEGIN

IF A[I]≤B[J] THEN BEGIN C[I+J-1]:=A[I]; I:=I+1 END
ELSE BEGIN C[I+J-1]:=B[J]; J:=J+1 END;

END;

I':=I;

INVARIANT DEFRANGE(I'+N, I+N-1, C) ∧ I'≤I ∧ I≤N+1
WHILE I≤N DO BEGIN C[I+N]:=A[I]; I:=I+1 END;

J':=J;

INVARIANT DEFRANGE(J'+N, J+N-1, C) ∧ J'≤J ∧ J≤N+1
WHILE J≤N DO BEGIN C[J+N]:=B[J]; J:=J+1 END;

END

The system will verify

DEF(A)  a  DEF(B)  [[body]]  DEF(C) 

i.e., that the program does not have any execution errors and that no elements of C are
missed. All of the other variables are initialized before the first loop. Still, it is
necessary to prove that they are DEF each time they are accessed. In the case of a
variable such as I, Runcheck uses the Lessdef lemma to infer that it has a value
everywhere in the program after the assignment I:=1. Even though I is changed on the
first loop, it is not necessary to write DEF(I) (or A, B, J, N) as an invariant.

In  many  array  programs,  the  arrays  are  either  supplied  as  fully  initialized  parameters, 
or  are  initialized  at  the  beginning.  Without  the  Lessdef  lemma,  it  would  be  necessary  to 
have  invariants  repeating  the  fact  that  an  array  or  other  data  structure  is  DEF  at 
various  points  within  a  program. 

Consider now the more complicated case of proving DEF(C). The system automatically
generates the statements shown in bold italics (the INVARIANT clauses and the
assignments to I' and J'). By examining the first loop, one can see
that at any time, values have been assigned to the positions C[1], ..., C[I+J-2]. This
fact is discovered by the system and is expressed in the invariant as

DEFRANGE(1, I+J-2, C).

DEFRANGE is a special predicate used to express that a subrange of an array is DEF.
Its definition is

DEFRANGE(x,y,a) ≡ (∀i x≤i≤y ⇒ DEF(a[i])).

The invariant for the second loop states that C[I'+N], ..., C[I+N-1] are DEF, where I'
stands for the value of I before entering the second loop. Similarly, the assertion for the
third loop states that C[J'+N], ..., C[J+N-1] have been assigned values. The system
also produces the arithmetic inequalities shown on each loop.


To be able to prove the exit assertion, DEF(C), it is necessary to show that all of
C[1], ..., C[200] have values after the third loop. Notice that each invariant only
describes the initializations done by its own loop. For instance, the third invariant only
deals with the last part of C, and does not repeat the fact that the first part of C is
initialized by the first loop. Runcheck uses the Lessdef lemma to infer that the first part
of C continues to be DEF, even though that fact is not included in the later invariants.
Thus the invariants shown are sufficient to prove that C is fully initialized on exit.
The documenter's assertions are also sufficient to show that the program executes safely.
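
The three DEFRANGE invariants can be exercised on concrete data. The Python sketch below is a hypothetical transcription of the merge program, not output of Runcheck. It assumes that an unassigned element of C is represented by None, that arrays are dictionaries indexed from 1, and that the ELSE branch of the first loop copies B[J] into C[I+J-1], which is what the first invariant requires. The names I0 and J0 play the role of I' and J'.

```python
UNDEF = None  # assumed marker for an element of C not yet assigned
N = 100

def defrange(x, y, arr):
    """DEFRANGE(x, y, a): every a[i] with x <= i <= y holds a value."""
    return all(arr[i] is not UNDEF for i in range(x, y + 1))

def merge(A, B):
    """Merge sorted A and B (indexed 1..N) into C (indexed 1..2N)."""
    C = {i: UNDEF for i in range(1, 2 * N + 1)}
    I = J = 1
    while I <= N and J <= N:
        assert defrange(1, I + J - 2, C)          # first invariant
        if A[I] <= B[J]:
            C[I + J - 1] = A[I]
            I += 1
        else:
            C[I + J - 1] = B[J]
            J += 1
    I0 = I
    while I <= N:
        assert defrange(I0 + N, I + N - 1, C)     # second invariant
        C[I + N] = A[I]
        I += 1
    J0 = J
    while J <= N:
        assert defrange(J0 + N, J + N - 1, C)     # third invariant
        C[J + N] = B[J]
        J += 1
    return C

A = {i: 2 * i for i in range(1, N + 1)}        # even numbers, sorted
B = {i: 2 * i - 1 for i in range(1, N + 1)}    # odd numbers, sorted
C = merge(A, B)
fully_defined = defrange(1, 2 * N, C)
```

Each assertion checks only the part of C written by its own loop, as in the text; the fact that the earlier part stays defined is exactly what the Lessdef lemma supplies.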


7.4  Inrange  lemma 

The Inrange lemma says that a program for which P [[A]] True holds cannot cause the
value of a subrange variable to become out of range (when started in a state which
satisfies P). If a subrange variable is known to always be DEF at some point in a
program that executes without errors, then the variable must be Inrange at that point.
To begin, we define Inrange*, a formula constructor similar to Inrange. The difference
between the two is that Inrange asserts that a subrange variable is in the correct range
and is always true for other types, while Inrange* asserts that every subrange variable
contained as a component of its argument is in the correct range.

Definition. Inrange* is a mapping <pascal variable> × <type> → <formula>. For simple
types, Inrange*(v, t) is true if Inrange(v, t) is. Inrange*(v, t) is true for a compound
type if Inrange*(c, type(c)) is true for every component c of v.
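
The difference between Inrange and Inrange* can be illustrated with a short recursive function. This Python sketch is an illustration under stated assumptions, not the verifier's definition: types are encoded as hypothetical tuples ("int",), ("subrange", lo, hi), ("array", lo, hi, elem) and ("record", fields), and only fully defined values are considered.

```python
def inrange(v, t):
    """Inrange: constrains subrange types only; true for every other type."""
    if t[0] == "subrange":
        _, lo, hi = t
        return isinstance(v, int) and lo <= v <= hi
    return True

def inrange_star(v, t):
    """Inrange*: every subrange component of v lies in its declared range."""
    if t[0] == "array":
        _, lo, hi, elem = t
        return all(inrange_star(v[i], elem) for i in range(lo, hi + 1))
    if t[0] == "record":
        return all(inrange_star(v[f], ft) for f, ft in t[1].items())
    return inrange(v, t)

# ARRAY [1..2] OF RECORD f: 1..10; g: INTEGER END
t = ("array", 1, 2, ("record", {"f": ("subrange", 1, 10), "g": ("int",)}))
good = {1: {"f": 3, "g": 99}, 2: {"f": 10, "g": -5}}
bad = {1: {"f": 3, "g": 99}, 2: {"f": 11, "g": -5}}
results = (inrange_star(good, t), inrange_star(bad, t), inrange(bad, t))
```

For these inputs results is (True, False, True): Inrange says nothing about an array as a whole, while Inrange* finds the out-of-range component bad[2].f.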

The idea of the Inrange lemma is a characterization of the possible sets of states of
programs that always execute without runtime errors. Any actual execution must begin
in the outermost block with all variables uninitialized. Data needed by the program is
obtained by a READ procedure which always returns values that are DEF and Inrange.
Given that the program always runs without errors, what do we know about the set of
all possible states if it terminates? Variables that the program assigns to every time it is
run will always be DEF and Inrange* at the end. Variables that are never touched by
the program will be completely unspecified at the end. Variables assigned to on some
runs but not on others can be ¬DEF at the end, or can have a value dependent on the
values of the other variables. If the value is dependent on the other variables, it must
be an Inrange* value. The essential point is: If a program determines the value of a
variable, the value must be Inrange*. If a variable is always DEF at the end of a
program, then it must always be Inrange*.

Definition. Let X be the set of simple components of the declared variables. For
instance, if v is declared

VAR v: ARRAY [1..2] OF RECORD f:INTEGER; g:BOOLEAN END;

then X will contain the variables v[1].f, v[2].f, v[1].g, v[2].g. Note that X is a set of
variables, not a set of the values of the variables. A state of a program is an assignment
of values to each of the elements of X. To refer conveniently to the value of a given
variable y∈X and the overall state, we will use the notation that the y-form of a state is
a pair <z,Z>, where z stands for the value of y, and Z stands for the values of the
variables in X-{y}.

A set S of states is DEF-convex for the variable y, iff
for all Z,

(∀z <z,Z>∈Sy ⇒ DEF(z)) implies (∀w <w,Z>∈Sy ⇒ Inrange(w, type(y))).
where Sy is the set of states in S, represented in y-form.

A set of states of X is DEF-convex iff it is DEF-convex for every variable in X. A
formula containing free occurrences of declared variables is DEF-convex iff it is
satisfied by a DEF-convex set of states.

Examples: assume the declared variables are
VAR x: INTEGER;

VAR y: 1..10;

(7.1) True, False     both DEF-convex

(7.2) y=2             DEF-convex

(7.3) y=40            not DEF-convex

(7.4) y≠40            DEF-convex

(7.5) DEF(y)          not DEF-convex

(7.6) x=1 ⇒ y=2       DEF-convex

(7.7) x=1 ∧ y=40      not DEF-convex

If S is the set of final states of a program that does not have runtime errors, then S is
DEF-convex. In the examples, a program can set y to 2, so 7.2 is DEF-convex, but 7.3
cannot be DEF-convex because 40 is out of range. Although y≠40 is DEF-convex, it is
not a possible set of final states; the DEF-convex sets include more than the sets of
final states. To attempt to characterize only final states would require much more detail than
we need here. Note that 7.5 is too weak to be a final set of states because it includes
both 7.2 (a possible set) and 7.3 (an impossible set).
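
The definition of DEF-convexity can be checked mechanically over a small finite universe of values. The Python sketch below is only an experiment with the definition, under several assumptions: the undefined value is modelled by None, the universe of values is the hypothetical five-element list VALUES, the declared variables are x: INTEGER and y: 1..10, and example 7.6 is taken to be the implication x=1 ⇒ y=2.

```python
UNDEF = None
VALUES = [UNDEF, 1, 2, 40, 41]   # small illustrative universe of values

def is_def(v):
    return v is not UNDEF

def inrange_x(v):
    return True                   # x: INTEGER, no range restriction

def inrange_y(v):
    return is_def(v) and 1 <= v <= 10   # y: 1..10

def def_convex(pred):
    """Check DEF-convexity of the set of states (x, y) satisfying pred."""
    states = [(x, y) for x in VALUES for y in VALUES if pred(x, y)]
    for pos, inrange in ((0, inrange_x), (1, inrange_y)):
        groups = {}               # values of this variable, grouped by Z
        for s in states:
            groups.setdefault(s[1 - pos], []).append(s[pos])
        for zs in groups.values():
            # If every value occurring with this Z is DEF, all must be in range.
            if all(is_def(z) for z in zs) and not all(inrange(z) for z in zs):
                return False
    return True

examples = {
    "7.1a": lambda x, y: True,
    "7.1b": lambda x, y: False,
    "7.2": lambda x, y: y == 2,
    "7.3": lambda x, y: y == 40,
    "7.4": lambda x, y: y != 40,
    "7.5": lambda x, y: is_def(y),
    "7.6": lambda x, y: x != 1 or y == 2,
    "7.7": lambda x, y: x == 1 and y == 40,
}
verdicts = {name: def_convex(pred) for name, pred in examples.items()}
```

Under these assumptions the verdicts agree with the list of examples: 7.3, 7.5 and 7.7 are reported as not DEF-convex and the rest as DEF-convex.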

Lemma  7.4a  If  a  program  is  started  in  a  DEF-convex  set  of  states  and  always 
executes  without  runtime  error,  then  the  final  set  of  states  will  be  DEF-convex. 

It follows that if a program always leaves a variable DEF when it halts, the variable
must be Inrange* at the end.

Lemma 7.4b If B is a Pascal statement, pv is a Pascal variable, P is a DEF-convex
predicate, and ⊢ P [[B]] DEF(pv), then ⊢ P [[B]] Inrange*(pv, type(pv)).

The restriction on P in this lemma is necessary. Recall that extended semantics does not
specify the initial values of variables, and that subrange type variables have the same
sort as the base type of the subrange. Consequently, there is nothing that says a
subrange variable cannot be out of range if its value is not assigned by the program.
The following formula is a theorem, even if the variable S is declared with a subrange
of only 1..100.

⊢ S=600 [[empty]] DEF(S) ∧ S=600.

Of  course,  the  extended  definition  checks  that  any  program  that  uses  the  value  of  S  first 
assigns  it  a  value  in  the  proper  range. 

Runcheck makes use of a restriction that the entry assertion for the outermost block of a
program must be DEF-convex.7 With this assumption, Runcheck can infer bounds on
the value of a subrange variable if it is known to be DEF. In some cases, this can
permit lengthy assertions to be omitted. For instance, if a complex data structure
contains subrange variables and the entire data structure is DEF, bounds for the
subrange variables can be deduced without any additional assertions. By induction on
the depth of procedure calls, the lemma can also be applied to formal parameters when
reasoning about a procedure body. Since a value parameter v must be DEF on entry,
Inrange*(v,t) must be true initially. Variable parameters do not have to be DEF on
entry, but if the value is used somewhere in a procedure body it must be possible to
prove that the variable is DEF and the Inrange lemma applies at that point.


Example 5: Constructing a Spanning Tree.

The following program is a simple algorithm [12] for finding a spanning tree of an
undirected loop-free graph with E edges and V vertices. If the graph is disconnected, it
grows a spanning forest. The graph is entered as a table of edges in the arrays IA and
JA, so that the vertices of the kth edge are IA[k] and JA[k]. The program stores the
indices of the spanning tree's edges in T[1], ..., T[V-P], where P is set to the number of
trees in the spanning forest.

This example illustrates the use of subranges and the Inrange lemma to strengthen the
entry assertion of a procedure. Since IA and JA are tables of vertices, they have been
declared as arrays of subrange values 1:V. It is typical in graph manipulating programs
to use a value stored in one array to compute an index into another array. Here, the
variable I is set to IA[K] and then VA[I] is accessed. For the latter access to be in the
subscript range 1:V of VA on every iteration, all elements of IA must have been in the
range initially. Because IA and JA are value parameters, their initial values must be
DEF, and by the Inrange lemma, Runcheck can infer that the elements are in the correct
range. Similar reasoning is required for other array accesses.

7 In an actual Pascal program, no assumptions can be made about the initial values of variables declared in an outermost block. To
be strictly realistic, the verifier should not permit entry assertions there. They are permitted as a small convenience; a main block
with an entry assertion is considered to be a shorthand for a procedure with globals. The significance of this is that the truth of
the entry assertion must be assured by some calling program. I.e., it is possible to declare a procedure with an entry assertion that
is not DEF-convex, but its actual set of entry states is then a DEF-convex restriction of the declared entry condition.

VAR E,V:INTEGER;

PROCEDURE SPANNING(IA,JA: ARRAY[1:E] OF 1:V;

VAR P: INTEGER;

VAR T: ARRAY[1:V-1] OF INTEGER);

ENTRY DEF(E) ∧ DEF(V) ∧ 1≤E ∧ 2≤V;

EXIT TRUE;

VAR I,J,K,C,N,R: INTEGER;

VAR VA: ARRAY[1:V] OF INTEGER;

BEGIN

C:=0;

N:=0;

FOR K:=1 TO V INVARIANT 1≤K ∧ K≤V+1 ∧ DEFRANGE(1,K-1,VA)

DO VA[K]:=0;

FOR K:=1 TO E

INVARIANT 1≤K ∧ K≤E+1 ∧ 0≤N ∧ 0≤C ∧ N≤K-1 ∧ C≤K-1 ∧ K≤V+N-1
DO BEGIN

IF K-N=V-1 THEN GOTO 1;

I:=IA[K];

J:=JA[K];

IF VA[I]=0 THEN
BEGIN

T[K-N]:=K;

IF VA[J]=0 THEN BEGIN
C:=C+1;

VA[J]:=C;

VA[I]:=C;

END

ELSE VA[I]:=VA[J];

END

ELSE IF VA[J]=0 THEN
BEGIN

T[K-N]:=K; VA[J]:=VA[I];

END

ELSE IF VA[I]≠VA[J] THEN
BEGIN

T[K-N]:=K; I:=VA[I]; J:=VA[J];

FOR R:=1 TO V INVARIANT 1≤R ∧ R≤V+1
DO IF VA[R]=J THEN VA[R]:=I;

END

ELSE N:=N+1
END;

1: P:=V-E+N;

END;

Note that IA and JA could have been declared as arrays of INTEGER, and the restriction
on  the  values  could  have  been  part  of  the  entry  assertion.  Expressing  the  restriction 
would  involve  a  quantified  assertion  such  as 

∀x (1≤x≤E ⇒ 1≤IA[x]≤V).

This  is  both  more  difficult  to  write  than  the  subrange  type  specification,  and  it  causes 
difficulty  in  theorem  proving. 


8.  Generalizations  of  the  extended  semantics 


8.1  Dynamic  subranges 

There are programming languages more flexible than Pascal, which allow declaration of
dynamic subranges. ADA, in particular, has flexible dynamic type declarations. A
reasonable extension to Pascal is to permit subrange declarations involving expressions,
e.g.

TYPE s = 1..2*x;

The expressions for the bounds are evaluated each time the scope is entered, and the
range of s is fixed for the duration. Dynamic arrays can be obtained by using a
dynamic subrange as the index type for an array, etc.

The extended semantics can be adapted to handle dynamic subranges by defining
Inrange(e, s) to refer to the values obtained when the expressions for the bounds on s
are evaluated. The declaration rules for functions and procedures would be changed to
check for error free evaluation of the expressions in the type declarations. Also,
depending on the restrictions in the programming language, renaming would be needed
to distinguish between the initial values of the variables appearing in the type
declaration and the values assigned after the dynamic declaration was evaluated.
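
The point about renaming can be seen in a small model. In the Python sketch below, which is an illustration only, the hypothetical class DynamicSubrange evaluates its bound expressions once, when the scope is entered, and keeps the resulting values; later assignments to x do not move the bounds. The representation of the scope as a dictionary env is an assumption of the sketch.

```python
class DynamicSubrange:
    """A subrange whose bounds are evaluated once, on entry to the scope."""

    def __init__(self, lo_expr, hi_expr, env):
        # Evaluate the bound expressions against the variable values that
        # hold at scope entry; a missing variable raises KeyError here,
        # which models an error while evaluating the declaration.
        self.lo = lo_expr(env)
        self.hi = hi_expr(env)

    def inrange(self, v):
        """Inrange(v, s) with respect to the bounds fixed at entry."""
        return self.lo <= v <= self.hi

env = {"x": 5}
# TYPE s = 1..2*x;  evaluated while x = 5, so s is 1..10
s = DynamicSubrange(lambda e: 1, lambda e: 2 * e["x"], env)
env["x"] = 100                         # a later assignment to x
checks = (s.inrange(10), s.inrange(11))
```

Here checks is (True, False): the range of s still refers to the initial value of x, which is why an assertion language needs a separate name for that initial value.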
8.2 Bounds on depth of recursion and dynamic variable allocation

Like the bound for arithmetic overflow, bounds on recursion and heap storage are
implementation dependent. In critical applications, the actual bounds may be set in
advance, and one might want to verify that the available storage will be sufficient. In
other cases, the particular bound is not important, but it might be useful to verify that a
program does not attempt unlimited recursion, etc.


To describe bounds on depth of calls, two new undeclared integer variables are
introduced in the procedure call rule. The variable Stksize represents the maximum
depth of calling; Stkptr represents the current depth. The procedure call rule is
modified to enforce a restriction that Stkptr≤Stksize. Neither variable can be assigned
to by the program. Stkptr is 0 on entry to a main program, and each level of function
or procedure calling increases it by 1. With these additions, the procedure call rule is

for i=1,...,m, P [[Eval Ai]] Inrange(Ai, ti),
for i=1,...,n, P [[Locate Vi]] True,
I(X,Y,G,S) [[Procedure p(X1:t1; ...; Xm:tm; VAR Y1:u1; ...; VAR Yn:un); B]] O(X,Y,G,S),
P [[Eval A1; ...; Eval Am; Locate V1; ...; Locate Vn]] Disjoint-set(V ∪ G)
    ∧ I(A,V,G,Stkptr+1,Stksize)
    ∧ ∀Z,GW (O(A,Z,GR,GW,Stkptr+1,Stksize) ⇒ Q[Z1/V1, ..., Zn/Vn])
    ∧ Stkptr+1 ≤ Stksize
---------------------------------------------------------------------------------- (PC2)
P [[p(A1,...,Am,V1,...,Vn)]] Q


where S stands for the set of variables {Stkptr, Stksize}. Note that in practical
applications, it might be important to use some measure of the actual amount of stack
space used by a program instead of just the depth of recursion. It would be simple to
define a different function that depended e.g., on the number and types of variables in
the procedure, for incrementing Stackptr. To measure the heap storage used, counters
can be added to the rules for NEW statements.


Example 6: Recursive Tree Traversal.

Type PTR is defined to be a pointer to a record with A and B fields of type PTR.
The recursive procedure WALK simply does a depth first walk on a tree P. To avoid
stack overflow, P must not lead to any cyclic list structure and there must be enough
room on the stack for DEPTH(P, #REC) procedure calls, so Stacksize must be greater than
or equal to Stackptr+DEPTH(P, #REC). Stackptr and Stacksize are declared as VIRTUAL
variables to indicate that they may appear in assertions, but may not be used in
executable parts of the program. ACYCLIC and DEPTH are user defined symbols for
documenting programs that operate on trees. The assertion DEF(#REC) states that every
allocated record in the heap of type REC is fully initialized. This assures that WALK
will not encounter uninitialized dynamic variables.

TYPE PTR=↑REC;

REC=RECORD A:PTR; B:PTR END;

VIRTUAL VAR Stackptr, Stacksize: INTEGER;

PROCEDURE WALK(P:PTR);

ENTRY ACYCLIC(P, #REC) ∧ DEF(#REC) ∧ Stacksize ≥ Stackptr+DEPTH(P, #REC);

EXIT TRUE;

BEGIN

IF P≠NIL THEN BEGIN WALK(P↑.A); WALK(P↑.B) END;

END;

The proof depends on two lemmas about acyclic list structure. If p is a pointer to
acyclic list structure in the reference class #r, then p↑.f points to acyclic list structure. If
p points to acyclic list structure, then the depth of p↑.f is less than the depth of p.

ACYCLIC(p, #r) ∧ p≠NIL ⇒ ACYCLIC(p↑.f, #r)

ACYCLIC(p, #r) ∧ p≠NIL ⇒ DEPTH(p↑.f, #r) ≤ DEPTH(p, #r)-1
(where .f is .A or .B)

The lemmas are provided by the user to the system in the form of inference rules [15]
to be used by the theorem prover.
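
The role of the stack-depth counters and of the two lemmas can be traced in an executable model. The Python sketch below is an illustration, not part of Runcheck: the virtual variables are passed as explicit arguments, the hypothetical helper call models the side condition Stkptr+1 ≤ Stksize of the modified call rule, and DEPTH(NIL) is assumed to be 0. The tree is assumed to be acyclic.

```python
class StackOverflow(Exception):
    """Raised when a call would exceed the assumed maximum depth."""

class Rec:
    def __init__(self, A=None, B=None):
        self.A, self.B = A, B

def depth(p):
    """DEPTH(p): 0 for NIL (an assumption), else 1 + the deeper subtree."""
    return 0 if p is None else 1 + max(depth(p.A), depth(p.B))

def call(stkptr, stksize):
    """Side condition of the modified call rule: Stkptr+1 <= Stksize."""
    if stkptr + 1 > stksize:
        raise StackOverflow
    return stkptr + 1

def walk(p, stkptr, stksize, trace):
    # ENTRY assertion of WALK (acyclicity is assumed, not checked).
    assert stksize >= stkptr + depth(p)
    trace.append(stkptr)          # record the depth of this activation
    if p is not None:
        walk(p.A, call(stkptr, stksize), stksize, trace)
        walk(p.B, call(stkptr, stksize), stksize, trace)

tree = Rec(Rec(Rec(), None), Rec())   # DEPTH(tree) = 3
trace = []
walk(tree, call(0, 4), 4, trace)      # main program calls with Stkptr = 0
deepest = max(trace)
```

With a stack size of 4 the walk completes and the deepest activation runs at depth 4; the entry assertion at each recursive call holds because the depth of a subtree is at most the depth of its parent minus one. With a stack size of 3 the entry assertion already fails at the first call.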
8.3 Procedure Parameters

Procedure (and function) formal parameters in Pascal have the weakness that the
arguments of formal procedures are not declared. It is not possible to determine
syntactically whether a procedure parameter is called with the right number and types of
arguments. It is a simple matter to tighten the language by introducing more detailed
declarations; if this is done, the usual syntactic checks can be performed for procedure
parameters, and they can be included in the axiomatic definition. As an example of a
program using more detailed declarations, Sum(a,b,f) computes the sum of f(x) when x
ranges from a to b.

FUNCTION Sum(a,b:INTEGER; f:FUNCTION(INTEGER):INTEGER): INTEGER;
VAR i,s:INTEGER;
BEGIN
  s:=0;
  FOR i:=a TO b DO s:=s+f(i);
  Sum:=s
END;

Clarke  [1]  shows  that  any  sound  and  complete  axiomatic  definition  of  procedure 
parameters  in  a  language  with  recursion,  static  scoping,  read  write  global  variables,  and 
internal  procedure  declarations,  must  depend  on  some  method  of  making  assertions 
about  the  state  of  the  runtime  stack  of  local  variables.  Such  an  approach  would  greatly 
complicate  both  the  semantic  definition  and  the  process  of  specifying  and  verifying 
programs.  Instead,  we  will  make  the  restriction  that  functions  or  procedures  with 
globals  may  not  be  passed  as  parameters.  With  this  restriction,  procedure  parameters 
can  be  introduced  in  a  natural  manner. 

The specification method will be to declare an Entry and Exit assertion for each formal
parameter; these will be used in the ordinary call rules when the formal is called. When
a procedure parameter is passed, the call rules will check that the actual satisfies the
declared specifications of the formal.
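
The division of labor in this method can be pictured with a dynamic analogue. The Python
sketch below is only an assumption-laden illustration: Spec, satisfies and sum_over are
hypothetical names, and testing an actual on sample inputs merely stands in for the
proof that the call rule requires. The body of sum_over relies only on the declared
specification of its formal f, while satisfies plays the role of the check made when an
actual is passed.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Spec:
    entry: Callable[[int], bool]       # role of the Entry assertion on the argument
    exit_: Callable[[int, int], bool]  # role of the Exit assertion on (argument, result)

def satisfies(actual, spec, samples):
    """Testing stand-in for the call-rule check: on every sample meeting the
    entry assertion, the result of the actual must meet the exit assertion."""
    return all(spec.exit_(y, actual(y)) for y in samples if spec.entry(y))

def sum_over(a, b, f, f_spec):
    """Sum f(i) for i in a..b, relying only on the declared specification of f."""
    s = 0
    for i in range(a, b + 1):
        assert f_spec.entry(i), "entry assertion of formal f violated"
        r = f(i)
        assert f_spec.exit_(i, r), "exit assertion of formal f violated"
        s += r
    return s

# Hypothetical specification: f accepts 0..100 and returns a value in 0..10000.
spec = Spec(entry=lambda y: 0 <= y <= 100,
            exit_=lambda y, r: 0 <= r <= 10000)

def square(y):
    return y * y

if satisfies(square, spec, range(-5, 106)):
    print(sum_over(1, 3, square, spec))
```

An actual such as a cubing function would be rejected by satisfies for this
specification, because its result at 100 falls outside the declared range.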

Nesting of procedure parameters is permitted to any finite depth. Thus a procedure can
have a procedure parameter which takes another procedure as one of its parameters, but
self application of procedures is not possible. The various possibilities are illustrated in
the example below: a procedure p has value parameters U, variable parameters V, a
function parameter s, and a procedure parameter q. The procedure q takes a function
parameter r.

The main specification given for p is a set of entry-exit assertions, Ip and Op. An
occurrence in the assertions of the formal function parameter s as a function sign stands
for the value of the functional parameter, and not for a constant function. The
assertions may be thought of as first order schemes, which the procedure call rule adapts
to particular calls by substituting the actual function sign for the formal s. To
distinguish this kind of substitution from substitution for free variables, the following
notation will be used.

Notation: Q[f](X) is a formula containing the function sign f and free variables X. After
a particular formula Q[f](X) has been introduced, we will write Q[g](Y) to stand for the
result of replacing the function sign f by g and substituting Y for X in Q.

Each formal procedure parameter has a declaration in p of its entry-exit assertions.
The declarations are like ordinary procedure declarations, except that the reserved word
FORMAL is used in place of the procedure body. Since the formal parameter q takes a
function r as an argument, the declaration of q has a declaration for r nested inside it.



Declarations with procedure and function formals.

PROCEDURE p(U; VAR V;
            FUNCTION s(Y):t;
            PROCEDURE q(W; FUNCTION r(Y):t));

  FUNCTION s(Y):t;                    % specifications of formal parameter s %
    ENTRY Is(Y);
    EXIT Os[s](Y,s);
    FORMAL;

  PROCEDURE q(W; FUNCTION r(Y):t);    % specifications of q %
    FUNCTION r(Y):t;                  % specifications of formal parameter of q %
      ENTRY Ir(Y);
      EXIT Or[r](Y,r);
      FORMAL;
    ENTRY Iq[r](W);
    EXIT Oq[r](W);
    FORMAL;

  GLOBAL GR, VAR GW;
  ENTRY Ip[s](U,V,G);
  EXIT Op[s](U,V,G);                  % specifications of p %
  BEGIN pbody END;                    % executable statements of p %


Notation: In the following rules, entry-exit assertions enclosed in brackets, «I,O», are
included in the procedure headers as an abbreviation for the full procedure declarations
as shown above.


The idea of the declaration rule is to use the declared entry-exit specifications of the
formal parameters, in this case s and q, to prove the specifications for p. Then for calls
to p, the call rule will check that the actual function and procedure parameters satisfy
the specifications declared for s and q.


Example  Procedure  declaration. 

{ Is(Y) ⟦Function s(Y):t; FORMAL⟧ Os[s](Y,s),                          (8.1)
  Iq[r](W) ⟦Procedure q(W; r:«Ir,Or»); FORMAL⟧ Oq[r](W) }               (8.2)
  ⊢ Ip[s](U,V,G) ∧ DEF(U) ∧ Inrange(Ui,ti) ⟦pbody⟧ Op[s](U,V,G)         (8.3)
------------------------------------------------------------------------------
Ip[s](U,V,G)
  ⟦Procedure p(U; V; s:«Is,Os»; q(W; r:«Ir,Or»):«Iq,Oq»); pbody⟧ Op[s](U,V,G)


Example Procedure call.

for i=1,...,m,  P ⟦Eval Ai⟧ Inrange(Ai, ti),                            (8.4)

for i=1,...,n,  P ⟦Locate Bi⟧ True,                                     (8.5)

Ip[s](U,V,G)
  ⟦Procedure p(U; V; s:«Is,Os»; q(W; r:«Ir,Or»):«Iq,Oq»); pbody⟧ Op[s](U,V,G),
                                                                        (8.6)

Is(Y) ⟦Function c(Y):t; cbody[Y]⟧ Os[c](Y,c),                           (8.7)

Iq[r](X) ⟦Procedure d(X; r:«Ir,Or»); dbody[X;r]⟧ Oq[r](X),              (8.8)

P ⟦Eval A1;...;Eval Am; Locate B1;...;Locate Bn⟧ Disjoint-set(B ∪ G)
      ∧ Ip[c](A,B,G)
                                       B1 ... Bn
      ∧ ∀Z,GW (Op[c](A,Z,GR,GW) ⊃ Q |           )                       (8.9)
                                       Z1 ... Zn
------------------------------------------------------------------------------
P ⟦p(A,B,c,d)⟧ Q


In  the  declaration  rule,  the  specifications  of  the  procedure  parameters  s  and  q  are  used 
as  assumptions  (8.1  and  8.2)  for  proving  the  entry-exit  specifications  of  the  main 
procedure  p.  This  rule  can  be  justified  by  the  type  requirements,  which  do  not  permit 
self  applications  that  could  lead  to  circular  proofs. 

For  the  procedure  call,  conditions  8.4,  8.5  and  8.6  are  as  before.  Condition  8.7  checks 
that  the  actual  function  parameter  c  satisfies  the  specifications  of  s;  8.8  checks  the  entry- 
exit  assertions  for  the  actual  procedure  d. 


9. Discussion

Our  definition  of  Pascal  describes  some  important  aspects  of  the  language  that  have  not 
been  included  in  previous  axiomatic  definitions.  We  began  by  recalling  that  a  proof  of 
P {A} Q does not give any assurance that a program will be free from runtime errors,
and argued that a stronger relation, P ⟦A⟧ Q, is a better indicator of program reliability.
As part of our presentation of Pascal semantics, we have developed a precise and
comprehensive definition of the evaluation of expressions and Pascal variables, using
partial correctness statements to account for function calls within expressions. Previous
axiomatic definitions have not dealt fully with the semantics of function calls within
expressions.  We  then  used  the  definition  of  evaluation  to  define  Pascal  statements, 
procedures  and  functions.  The  complete  definition  is  very  concise,  although  it  captures 
many  complicated  details  of  the  language.  One  of  the  crucial  advantages  of  our 
axiomatic  technique  is  its  simplicity;  absent  are  the  clouds  of  obscuring  notation 
commonly  found  in  denotational  definitions.  The  clarity  and  simplicity  of  our  approach 
are  of  greatest  importance  when  the  definition  is  actually  used  to  verify  programs; 
because  program  specifications  and  the  proofs  are  also  simple  and  understandable,  the 
user  is  free  to  concentrate  on  the  real  issues  surrounding  a  program  and  its  correctness. 

Our  axiomatic  definition  has  been  part  of  a  development  with  the  goal  of  building  a 
useful  automatic  verifier.  This  has  influenced  the  definition  in  several  ways.  One 
important  requirement  for  useful  verification  is  to  have  convenient  methods  for 
specifying  programs.  In  Runcheck,  specifications  are  greatly  simplified  by  having  a 
single  predicate,  DEF,  as  the  basis  of  all  predicates  referring  to  variable  initialization. 
The Lessdef and Inrange lemmas also eliminate the need for certain kinds of detail in
specifications. Although the idea of derived inference rules is by no means new, this
technique is more useful in practice than has been previously realized.


Appendix  A:  Development  of  the  WHILE  Rule. 


This section explains the actual While rule used in Runcheck. The rule of
section 6.2,

P ⊃ I,
I ⟦Eval B; ASSUME B; S⟧ I,
I ⟦Eval B⟧ ¬B ⊃ Q
----------------------------------------                    (WHILE1)
P ⟦INVARIANT I WHILE B DO S⟧ Q

does not help to reduce the need for detailed invariants and is not convenient to use in
practice. The implemented rule has four additional features:

1) It adds an invariant referring to the evaluation of the While test, B. B is evaluated
once on each iteration, and so it must be an invariant of the loop that B can evaluate
safely.

2) It makes it unnecessary for the invariant to refer to variables which cannot be
changed in the loop. This has been previously called a frame axiom [8, 14].

3) It applies the Lessdef lemma, adding to the invariant the information that variables
changed in the loop cannot become less fully initialized.

4) Runcheck's automatic documentor generates invariants which are valid in the
unextended semantics. Because proofs in the extended semantics can be separated, with
part done in the ordinary semantics (Specification lemma), the extended While rule can
assume the validity of documentor invariants without reproving them.

We  now  discuss  the  implementation  of  these  changes. 

1) From the definition of P ⟦Eval e⟧ Q, one can write down a sufficient precondition
for e to evaluate without error. This formula will be called PRE[Eval e; True]. As an
example, if the test of a While loop is f(a)+b≤0 and f has the declaration

FUNCTION f(x: INTEGER): c..d;
ENTRY I(x);
EXIT O(x);

then the condition

PRE[Eval f(a)+b≤0; True]
  ≡ DEF(a) ∧ DEF(b) ∧ I(a)
    ∧ (O(a) ∧ DEF(f(a)) ∧ c≤f(a)≤d ⊃ -MAXINT≤f(a)+b≤MAXINT)

is added as an invariant of the loop.

2) The variable identifiers are divided into a subset X which are not changed in the
body of the loop and a subset Y which may be changed. A set of new unique variables,
Y', is introduced. The extended form of the frame rule is

P(X,Y) ⊃ I(X,Y),
P(X,Y) ∧ I(X,Y') ⟦Eval B(X,Y'); Assume B(X,Y'); S(X,Y')⟧ I(X,Y'),
P(X,Y) ∧ I(X,Y') ⟦Eval B(X,Y')⟧ ¬B(X,Y') ⊃ Q(X,Y')
------------------------------------------------------------------
P(X,Y) ⟦Invariant I(X,Y) While B(X,Y) Do S(X,Y)⟧ Q(X,Y)

where the Y variables stand for the values of variables before the loop and the Y'
variables stand for the values of variables during or after the loop.

3) For each variable, y, which can be changed in the body, Lessdef(y, y') can be
assumed to be a valid invariant.

4) Documentor invariants D(X,Y,Y') can be assumed valid.

The  final  rule  is: 

P(X,Y) ⊃ I(X,Y) ∧ PRE,

P(X,Y) ∧ I(X,Y') ∧ PRE ∧ Lessdef(Y,Y') ∧ D(X,Y,Y')
    ⟦Eval B(X,Y'); Assume B(X,Y'); S(X,Y')⟧ I(X,Y') ∧ PRE,

P(X,Y) ∧ I(X,Y') ∧ PRE ∧ Lessdef(Y,Y') ∧ D(X,Y,Y')
    ⟦Eval B(X,Y')⟧ ¬B(X,Y') ⊃ Q(X,Y')
------------------------------------------------------------------    (WHILE2)
P(X,Y) ⟦Invariant I(X,Y) While B(X,Y) Do S(X,Y)⟧ Q(X,Y)

where PRE is PRE[Eval B; TRUE].


Appendix  B:  Simultaneous  Substitution  for  Disjoint  Variables 

In  this  section,  we  present  the  definitions  of  disjointness  for  Pascal  variables  and 
simultaneous  substitution  for  disjoint  Pascal  variables.  To  begin,  we  need  to  define  the 
translation  of  a  Pascal  variable  into  a  standard  representation  as  a  sequence  consisting 
of  a  main  variable  identifier  followed  by  zero  or  more  selectors.  In  the  following, 
<e1, ..., en> denotes a sequence of n terms, and the operator • stands for concatenation
of finite sequences.

The function Seq(v): <Pascal variable> → <term sequence> is defined as follows:

Seq(id)   = <id>            if id is an identifier
Seq(v.f)  = Seq(v) • <.f>
Seq(v[i]) = Seq(v) • <i>
Seq(v↑)   = <#t, v>         where #t is the reference class

Definition of Disjoint(v, w)

Let v and w be Pascal variables and Seq(v) = <v0, ..., vn>, Seq(w) = <w0, ..., wm>,
and assume m≤n. Then Disjoint(v, w) is the following formula:

if v0 and w0 are distinct identifiers, then Disjoint(v, w) → True;
otherwise, Disjoint(v, w) → (v1≠w1 ∨ ... ∨ vm≠wm)

The current implementation of Runcheck uses a much more restrictive definition of
disjointness (it only compares v0 and w0); this restriction is not essential and will be
removed in a later version.
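
To make the two definitions concrete, the following Python sketch computes Seq and the
Disjoint formula for a hypothetical encoding of Pascal variables as nested tuples. The
encoding, the string form of selectors, and the representation of the resulting
disjunction as a list of pairs are all assumptions made for illustration; selectors are
compared position by position up to the length of the shorter sequence, which is how we
read the definition.

```python
# Hypothetical encoding of Pascal variables:
#   ("id", name)             an identifier
#   ("field", v, f)          v.f
#   ("index", v, i)          v[i], with i an index term written as a string
#   ("deref", ptr, t)        ptr^, with ptr a pointer term and t the record type

def seq(v):
    """Seq(v) as a Python list, following the four defining clauses."""
    kind = v[0]
    if kind == "id":                      # Seq(id) = <id>
        return [v[1]]
    if kind == "field":                   # Seq(v.f) = Seq(v) . <.f>
        return seq(v[1]) + ["." + v[2]]
    if kind == "index":                   # Seq(v[i]) = Seq(v) . <i>
        return seq(v[1]) + [v[2]]
    if kind == "deref":                   # Seq(v^) = <#t, v>
        return ["#" + v[2], v[1]]
    raise ValueError("unknown variable form: %r" % (kind,))

def disjoint(v, w):
    """Return True when the main identifiers differ; otherwise return the list
    of selector pairs (a, b) whose inequalities a != b form the disjunction.
    An empty list is the empty disjunction, i.e. the variables overlap."""
    sv, sw = seq(v), seq(w)
    if len(sw) > len(sv):
        sv, sw = sw, sv                   # arrange m <= n as in the definition
    if sv[0] != sw[0]:
        return True
    return list(zip(sv[1:], sw[1:]))

a_i = ("index", ("id", "a"), "i")
a_j = ("index", ("id", "a"), "j")
print(seq(a_i), disjoint(a_i, a_j))
```

For a[i] and a[j] the result is the single pair (i, j), standing for the formula i≠j; for
a and a[i] the result is the empty disjunction, since a whole array is never disjoint from
one of its own elements.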

Simultaneous  Substitution 

We can now define a simultaneous substitution of n terms e1, ..., en for disjoint
v1, ..., vn. Let Seq(vi) = <vi0, ..., vimi> for i = 1, ..., n. Let t1, ..., tn and dij for
i = 1, ..., n, j = 1, ..., mi, be new identifiers not appearing in P, the vi or the ei.

Define Unseq: <term sequence> → <Pascal variable> to be the inverse of Seq;
Unseq(Seq(v)) = v.

Then we can define

    v1 ... vn
P |
    e1 ... en


Example B.1: Simultaneously swapping a[i] with a[j] and changing i.

           a[i] a[j] i
P(a,i,j) |
           a[j] a[i] i+1

             a[d1] a[d2] i     t1   t2   t3      d1 d2
= P(a,i,j) |                |                 |
             t1    t2    t3    a[j] a[i] i+1     i  j

= P(<<a, [j], a[i]>, [i], a[j]>, i+1, j)

Note that <<a, [j], a[i]>, [i], a[j]> stands for the value of the array a after first assigning
the value a[i] to the jth position, and assigning a[j] to the ith position.
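
The array value <<a, [j], a[i]>, [i], a[j]> of Example B.1 can be computed directly. In the
Python sketch below, update is a hypothetical helper giving the value of <a, [i], e>, and
the local names t1, t2, t3 and d1, d2 are meant to mirror the roles of the new identifiers
in the definition: every right-hand side and every selector is read in the initial state
before any position is changed. Python lists are indexed from 0, which is an assumption of
the sketch and not of the Pascal text.

```python
def update(a, i, e):
    """Value of <a, [i], e>: a copy of a with position i replaced by e."""
    b = list(a)
    b[i] = e
    return b

def simultaneous(a, i, j):
    """Simultaneous assignment a[i], a[j], i := a[j], a[i], i+1."""
    t1, t2, t3 = a[j], a[i], i + 1   # right-hand sides, read in the initial state
    d1, d2 = i, j                    # selectors, fixed in the initial state
    a2 = update(update(a, d2, t2), d1, t1)   # <<a, [j], a[i]>, [i], a[j]>
    return a2, t3, j

def sequential(a, i, j):
    """For contrast: a[i] := a[j]; a[j] := a[i]; i := i+1 done one at a time."""
    a = update(a, i, a[j])
    a = update(a, j, a[i])           # reads the already overwritten a[i]
    return a, i + 1, j

print(simultaneous([10, 20, 30], 0, 2))
print(sequential([10, 20, 30], 0, 2))
```

The sequential version loses the original a[i], which is exactly what the intermediate
identifiers in the definition of simultaneous substitution are there to prevent.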


Example B.2: Swapping two variables accessed by pointers.

Consider the effect of simultaneously interchanging x↑ and y↑, where x and y are
pointer variables.

TYPE PTR = ↑INTEGER;
VAR x, y: PTR;

                    #INTEGER⊂x⊃  #INTEGER⊂y⊃
P(x, y, #INTEGER) |
                    #INTEGER⊂y⊃  #INTEGER⊂x⊃

= P(x, y, <<#INTEGER, ⊂y⊃, #INTEGER⊂x⊃>, ⊂x⊃, #INTEGER⊂y⊃>)

The final value of the reference class #INTEGER is exactly analogous to the final value
of the array a in Example B.1.